[jira] [Commented] (HADOOP-16378) RawLocalFileStatus throws exception if a file is created and deleted quickly

2019-07-16 Thread K S (JIRA)


[ https://issues.apache.org/jira/browse/HADOOP-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886671#comment-16886671 ]

K S commented on HADOOP-16378:
--

Hey [~ste...@apache.org]. Sorry, but unfortunately I have not been able to 
reproduce it. I'm not sure I can even post a stack trace, as it is part of my 
company's IP. I have already posted the reproduction steps in the description 
of this bug; I hope that helps.

> RawLocalFileStatus throws exception if a file is created and deleted quickly
> 
>
> Key: HADOOP-16378
> URL: https://issues.apache.org/jira/browse/HADOOP-16378
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.3.0
> Environment: Ubuntu 18.04, Hadoop 2.7.3 (though the problem also 
> exists on later Hadoop versions), Java 8 (and Java 11).
>Reporter: K S
>Priority: Critical
>
> The bug occurs when NFS creates temporary ".nfs*" files as part of file moves 
> and accesses. If such a file is deleted very quickly after being created, a 
> RuntimeException is thrown. The root cause is the loadPermissionInfo method 
> in org.apache.hadoop.fs.RawLocalFileSystem. To get the permission info, it 
> first runs
> {code:bash}
> ls -ld
> {code}
> and then attempts to get permission info for each listed file. If a file 
> disappears between these two steps, an exception is thrown.
> *Reproduction Steps:*
> An isolated way to reproduce the bug is to run FileInputFormat.listStatus 
> repeatedly on the directory in which the temporary files are being created; 
> a sketch of such a loop follows this quoted description. On Ubuntu or any 
> other Linux-based system, this should fail intermittently.
> *Fix:*
> One way in which we managed to fix this was to ignore the exception thrown 
> in loadPermissionInfo() if the exit code is 1 or 2 (also sketched below). 
> Alternatively, turning "useDeprecatedFileStatus" off in RawLocalFileSystem 
> might fix this issue, though we never tested this, and that flag was 
> introduced to fix HADOOP-9652. This could also be addressed in conjunction 
> with HADOOP-8772.
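
A minimal sketch of such a repro loop, under some assumptions: it drives the 
local FileSystem directly rather than going through FileInputFormat, and the 
class name, churn directory, and churn thread are all illustrative.

{code:java}
import java.io.File;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListStatusRace {
  public static void main(String[] args) throws IOException {
    File dir = new File("/tmp/liststatus-race"); // churn directory (illustrative)
    dir.mkdirs();

    // Background thread that rapidly creates and deletes ".nfs*"-style dot
    // files, mimicking the NFS behaviour described above.
    Thread churn = new Thread(() -> {
      for (int i = 0; ; i++) {
        File f = new File(dir, ".nfs" + i);
        try {
          f.createNewFile();
        } catch (IOException ignored) {
          // keep churning
        }
        f.delete();
      }
    });
    churn.setDaemon(true);
    churn.start();

    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path p = new Path(dir.getAbsolutePath());
    while (true) {
      // getPermission() forces the lazy, shell-based permission load,
      // which is where the race surfaces as a RuntimeException.
      for (FileStatus st : fs.listStatus(p)) {
        st.getPermission();
      }
    }
  }
}
{code}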
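
A sketch of the exit-code-tolerant handling proposed under *Fix* above. The 
real loadPermissionInfo() differs in its details, so this only illustrates the 
shape of the change; the PermissionProbe interface is a stand-in for the shell 
call, not a Hadoop type.

{code:java}
import java.io.IOException;

import org.apache.hadoop.util.Shell;

final class TolerantPermissionLoad {

  /** Stand-in for the probe loadPermissionInfo() performs. */
  interface PermissionProbe {
    void run() throws IOException; // e.g. exec "ls -ld <path>" and parse it
  }

  static void loadTolerantly(PermissionProbe probe) throws IOException {
    try {
      probe.run();
    } catch (Shell.ExitCodeException e) {
      // "ls" exits with 1 or 2 when the target no longer exists; treat that
      // as a benign race instead of surfacing a RuntimeException.
      if (e.getExitCode() != 1 && e.getExitCode() != 2) {
        throw new RuntimeException(e);
      }
      // otherwise leave permission/owner/group unset for the vanished file
    }
  }
}
{code}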



[jira] [Commented] (HADOOP-16378) RawLocalFileStatus throws exception if a file is created and deleted quickly

2019-06-28 Thread K S (JIRA)


[ https://issues.apache.org/jira/browse/HADOOP-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875103#comment-16875103 ]

K S commented on HADOOP-16378:
--

Sorry, I forgot to try to reproduce it. I will do it over the weekend.




[jira] [Commented] (HADOOP-16378) RawLocalFileStatus throws exception if a file is created and deleted quickly

2019-06-24 Thread K S (JIRA)


[ https://issues.apache.org/jira/browse/HADOOP-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871768#comment-16871768 ]

K S commented on HADOOP-16378:
--

Eh, it'll be a little difficult to reproduce. We discovered the error by 
accident while running company software, and managed to reproduce it by 
running a set of programs alongside a bash script that quickly creates and 
deletes files starting with ".". I will try to reproduce it tomorrow evening.




[jira] [Commented] (HADOOP-16378) RawLocalFileStatus throws exception if a file is created and deleted quickly

2019-06-24 Thread Steve Loughran (JIRA)


[ https://issues.apache.org/jira/browse/HADOOP-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871248#comment-16871248 ]

Steve Loughran commented on HADOOP-16378:
-

Let's not worry about that; the main thing is not to suffer here.

What's the full stack trace?




[jira] [Commented] (HADOOP-16378) RawLocalFileStatus throws exception if a file is created and deleted quickly

2019-06-20 Thread K S (JIRA)


[ https://issues.apache.org/jira/browse/HADOOP-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868812#comment-16868812 ]

K S commented on HADOOP-16378:
--

[~ste...@apache.org] I don't see that happening. The loadPermissionInfo method 
only uses the shell, unless you're looking somewhere else. Moving off the 
shell entirely would be a good idea, though I'm not familiar enough with this 
codebase to give any sort of advice. It would be good to have other developers 
weigh in here.




[jira] [Commented] (HADOOP-16378) RawLocalFileStatus throws exception if a file is created and deleted quickly

2019-06-18 Thread Steve Loughran (JIRA)


[ https://issues.apache.org/jira/browse/HADOOP-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1689#comment-1689 ]

Steve Loughran commented on HADOOP-16378:
-

I'd prefer moving off the shell entirely and onto the fs APIs, either Java or 
Hadoop native. Doesn't it already drop to a native lib if it's available?
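
For what it's worth, a shell-free probe via java.nio could look roughly like 
the sketch below (illustrative only; owner/group mapping, Windows handling, 
and wiring into RawLocalFileSystem are all left out).

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFileAttributes;

final class PosixProbe {
  /**
   * Reads owner, group and permission bits without forking "ls".
   * A file deleted mid-listing surfaces as NoSuchFileException, which the
   * caller can handle cleanly instead of seeing a RuntimeException.
   */
  static PosixFileAttributes read(Path file) throws IOException {
    try {
      return Files.readAttributes(
          file, PosixFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
    } catch (NoSuchFileException e) {
      return null; // vanished between the listing and the probe
    }
  }
}
{code}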
