[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=498530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498530
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:14
Start Date: 09/Oct/20 14:14
Worklog Time Spent: 10m 
  Work Description: bgaborg commented on a change in pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#discussion_r501626796



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -2956,55 +2956,30 @@ S3AFileStatus innerGetFileStatus(final Path f,
 // a file has been found in a non-auth path and the caller has not said
 // they only care about directories
 LOG.debug("Metadata for {} found in the non-auth metastore.", path);
-// If the timestamp of the pm is close to "now", we don't need to
-// bother with a check of S3. that means:
-// one of : status modtime is close to now,
-//  or pm.getLastUpdated() == now
-
-// get the time in which a status modtime is considered valid
-// in a non-auth metastore
-long validTime =
-ttlTimeProvider.getNow() - ttlTimeProvider.getMetadataTtl();
-final long msModTime = msStatus.getModificationTime();
-
-if (msModTime < validTime) {
-  LOG.debug("Metastore entry of {} is out of date, probing S3", path);
-  try {
-S3AFileStatus s3AFileStatus = s3GetFileStatus(path,
-key,
-probes,
-tombstones,
-needEmptyDirectoryFlag);
-// if the new status is more current than that in the metastore,
-// it means S3 has changed and the store needs updating
-final long s3ModTime = s3AFileStatus.getModificationTime();
-
-if (s3ModTime > msModTime) {
-  // there's new data in S3
-  LOG.debug("S3Guard metadata for {} is outdated;"
-  + " s3modtime={}; msModTime={} updating metastore",
-  path, s3ModTime, msModTime);
-  // add to S3Guard
-  S3Guard.putAndReturn(metadataStore, s3AFileStatus,
-  ttlTimeProvider);
-} else {
-  // the modtime of the data is the same as/older than the s3guard
-  // value either an old object has been found, or the existing one
-  // was retrieved in both cases -refresh the S3Guard entry so the
-  // record's TTL is updated.
-  S3Guard.refreshEntry(metadataStore, pm, s3AFileStatus,
-  ttlTimeProvider);
-}
-// return the value
-// note that the checks for empty dir status below can be skipped
-// because the call to s3GetFileStatus include the checks there
-return s3AFileStatus;
-  } catch (FileNotFoundException fne) {
-// the attempt to refresh the record failed because there was
-// no entry. Either it is a new file not visible, or it
-// has been deleted (and therefore S3Guard is out of sync with S3)
-LOG.warn("Failed to find file {}. Either it is not yet visible, or "
-+ "it has been deleted.", path);
+final long msModTime = pm.getFileStatus().getModificationTime();
+
+S3AFileStatus s3AFileStatus;
+try {
+  s3AFileStatus = s3GetFileStatus(path,
+  key,
+  probes,
+  tombstones,
+  needEmptyDirectoryFlag);
+} catch (FileNotFoundException fne) {
+  s3AFileStatus = null;

Review comment:
   A debug log could be added here noting that an exception was received 
and that is why the value is null.

##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARemoteFileChanged.java
##
@@ -326,7 +326,7 @@ protected Path path() throws IOException {
* @return a number >= 0.
*/
   private int getFileStatusHeadCount() {
-return authMode ? 0 : 0;

Review comment:
   Interesting, why was it 0 : 0? I really hadn't noticed it last time.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 498530)
Time Spent: 3h 40m  (was: 3.5h)

>  S3A to always probe S3 in S3A getFileStatus on non-auth paths
> --
>
> Key: HADOOP-17293
> URL: https://issues.apache.org/jira/browse/HADO

[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=498474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498474
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:10
Start Date: 09/Oct/20 14:10
Worklog Time Spent: 10m 
  Work Description: bgaborg commented on pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#issuecomment-70714


   (also, the test run against Ireland was successful)





Issue Time Tracking
---

Worklog Id: (was: 498474)
Time Spent: 3.5h  (was: 3h 20m)

>  S3A to always probe S3 in S3A getFileStatus on non-auth paths
> --
>
> Key: HADOOP-17293
> URL: https://issues.apache.org/jira/browse/HADOOP-17293
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> An incidental part of HDP-13230 was a fix to innerGetFileStatus, wherein 
> after a HEAD request we would update the DDB record, so resetting its TTL.
> Applications which did remote updates of buckets without going through 
> S3Guard are now triggering failures in applications in the cluster when they 
> go to open the file.
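
The probe-then-compare behaviour the issue title asks for can be sketched roughly as follows. This is a minimal model under stated assumptions, not the S3A implementation: `NonAuthProbeSketch`, `Action` and `decide` are hypothetical names, modification times are plain longs, and the outcome chosen when S3 is not newer than the metastore is an assumption, since the quoted hunk stops at the `catch` block.

```java
import java.util.OptionalLong;

/** Hypothetical model of the always-probe decision; not the S3A code itself. */
final class NonAuthProbeSketch {

  /** What to do with the metastore entry once S3 has been probed. */
  enum Action { KEEP_ENTRY, UPDATE_FROM_S3, WARN_NOT_FOUND }

  /**
   * @param msModTime modification time recorded in the metastore
   * @param s3ModTime modification time returned by the S3 probe,
   *                  empty if the probe found nothing
   */
  static Action decide(long msModTime, OptionalLong s3ModTime) {
    if (!s3ModTime.isPresent()) {
      // Either the object is not yet visible, or it has been deleted.
      return Action.WARN_NOT_FOUND;
    }
    // A newer object in S3 implies an update that bypassed the metastore.
    return s3ModTime.getAsLong() > msModTime
        ? Action.UPDATE_FROM_S3
        : Action.KEEP_ENTRY;
  }

  public static void main(String[] args) {
    System.out.println(decide(100L, OptionalLong.of(200L)));
    System.out.println(decide(100L, OptionalLong.of(100L)));
    System.out.println(decide(100L, OptionalLong.empty()));
  }
}
```

The point of the sketch is that the probe result, rather than the age of the metastore entry, drives the decision, so an out-of-band overwrite is noticed on the next status call.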



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=498393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498393
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 14:03
Start Date: 09/Oct/20 14:03
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#discussion_r501703520



##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARemoteFileChanged.java
##
@@ -326,7 +326,7 @@ protected Path path() throws IOException {
* @return a number >= 0.
*/
   private int getFileStatusHeadCount() {
-return authMode ? 0 : 0;

Review comment:
   it was 1: 0 and th

##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARemoteFileChanged.java
##
@@ -326,7 +326,7 @@ protected Path path() throws IOException {
* @return a number >= 0.
*/
   private int getFileStatusHeadCount() {
-return authMode ? 0 : 0;

Review comment:
   It was originally authMode ? 0 : 1; I'd switched it to 0 : 0 as things 
changed, but didn't actually delete it, so I am reinstating the old code.
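
A minimal sketch of the reinstated helper, assuming (per the issue title) that a getFileStatus call on a non-auth path now always issues one HEAD probe while an authoritative path issues none. `HeadCountSketch` is a hypothetical stand-in for the test class; only the `authMode ? 0 : 1` expression comes from the comment above.

```java
/** Hypothetical stand-in for the test class; only the ternary is from the review. */
final class HeadCountSketch {

  /** True when the metastore is treated as authoritative for the path. */
  private final boolean authMode;

  HeadCountSketch(boolean authMode) {
    this.authMode = authMode;
  }

  /**
   * Expected number of HEAD requests for one getFileStatus call.
   * @return a number >= 0.
   */
  int getFileStatusHeadCount() {
    // auth mode trusts the metastore; non-auth mode is assumed to probe S3 once
    return authMode ? 0 : 1;
  }
}
```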

##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -2956,55 +2956,30 @@ S3AFileStatus innerGetFileStatus(final Path f,
 // a file has been found in a non-auth path and the caller has not said
 // they only care about directories
 LOG.debug("Metadata for {} found in the non-auth metastore.", path);
-// If the timestamp of the pm is close to "now", we don't need to
-// bother with a check of S3. that means:
-// one of : status modtime is close to now,
-//  or pm.getLastUpdated() == now
-
-// get the time in which a status modtime is considered valid
-// in a non-auth metastore
-long validTime =
-ttlTimeProvider.getNow() - ttlTimeProvider.getMetadataTtl();
-final long msModTime = msStatus.getModificationTime();
-
-if (msModTime < validTime) {
-  LOG.debug("Metastore entry of {} is out of date, probing S3", path);
-  try {
-S3AFileStatus s3AFileStatus = s3GetFileStatus(path,
-key,
-probes,
-tombstones,
-needEmptyDirectoryFlag);
-// if the new status is more current than that in the metastore,
-// it means S3 has changed and the store needs updating
-final long s3ModTime = s3AFileStatus.getModificationTime();
-
-if (s3ModTime > msModTime) {
-  // there's new data in S3
-  LOG.debug("S3Guard metadata for {} is outdated;"
-  + " s3modtime={}; msModTime={} updating metastore",
-  path, s3ModTime, msModTime);
-  // add to S3Guard
-  S3Guard.putAndReturn(metadataStore, s3AFileStatus,
-  ttlTimeProvider);
-} else {
-  // the modtime of the data is the same as/older than the s3guard
-  // value either an old object has been found, or the existing one
-  // was retrieved in both cases -refresh the S3Guard entry so the
-  // record's TTL is updated.
-  S3Guard.refreshEntry(metadataStore, pm, s3AFileStatus,
-  ttlTimeProvider);
-}
-// return the value
-// note that the checks for empty dir status below can be skipped
-// because the call to s3GetFileStatus include the checks there
-return s3AFileStatus;
-  } catch (FileNotFoundException fne) {
-// the attempt to refresh the record failed because there was
-// no entry. Either it is a new file not visible, or it
-// has been deleted (and therefore S3Guard is out of sync with S3)
-LOG.warn("Failed to find file {}. Either it is not yet visible, or "
-+ "it has been deleted.", path);
+final long msModTime = pm.getFileStatus().getModificationTime();
+
+S3AFileStatus s3AFileStatus;
+try {
+  s3AFileStatus = s3GetFileStatus(path,
+  key,
+  probes,
+  tombstones,
+  needEmptyDirectoryFlag);
+} catch (FileNotFoundException fne) {
+  s3AFileStatus = null;

Review comment:
  

[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=498287&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498287
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:54
Start Date: 09/Oct/20 13:54
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#issuecomment-705636416


   patch merged to 3.3.x and trunk, thanks!





Issue Time Tracking
---

Worklog Id: (was: 498287)
Time Spent: 3h 10m  (was: 3h)




[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=498149&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498149
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:44
Start Date: 09/Oct/20 13:44
Worklog Time Spent: 10m 
  Work Description: steveloughran closed pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361


   





Issue Time Tracking
---

Worklog Id: (was: 498149)
Time Spent: 3h  (was: 2h 50m)




[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=498085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498085
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 13:38
Start Date: 09/Oct/20 13:38
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#issuecomment-705619305









Issue Time Tracking
---

Worklog Id: (was: 498085)
Time Spent: 2h 50m  (was: 2h 40m)




[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=497418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497418
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 08/Oct/20 15:50
Start Date: 08/Oct/20 15:50
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#issuecomment-705660591


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  9s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 5 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 25s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 42s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m  4s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m  2s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 50s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 16s |  |  the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   1m  4s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 37s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  77m 50s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2361/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2361 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 29813eb6dfb6 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 52db86b0bb4 |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2361/3/testReport/ |
   | Max. process+thread count | 340 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2361/3/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https:

[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=497400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497400
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 08/Oct/20 15:11
Start Date: 08/Oct/20 15:11
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#issuecomment-705636416


   patch merged to 3.3.x and trunk, thanks!





Issue Time Tracking
---

Worklog Id: (was: 497400)
Time Spent: 2.5h  (was: 2h 20m)




[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=497398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497398
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 08/Oct/20 15:11
Start Date: 08/Oct/20 15:11
Worklog Time Spent: 10m 
  Work Description: steveloughran closed pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361


   





Issue Time Tracking
---

Worklog Id: (was: 497398)
Time Spent: 2h 20m  (was: 2h 10m)




[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=497397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497397
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 08/Oct/20 15:10
Start Date: 08/Oct/20 15:10
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#discussion_r501799545



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -2956,55 +2956,30 @@ S3AFileStatus innerGetFileStatus(final Path f,
 // a file has been found in a non-auth path and the caller has not said
 // they only care about directories
 LOG.debug("Metadata for {} found in the non-auth metastore.", path);
-// If the timestamp of the pm is close to "now", we don't need to
-// bother with a check of S3. that means:
-// one of : status modtime is close to now,
-//  or pm.getLastUpdated() == now
-
-// get the time in which a status modtime is considered valid
-// in a non-auth metastore
-long validTime =
-ttlTimeProvider.getNow() - ttlTimeProvider.getMetadataTtl();
-final long msModTime = msStatus.getModificationTime();
-
-if (msModTime < validTime) {
-  LOG.debug("Metastore entry of {} is out of date, probing S3", path);
-  try {
-S3AFileStatus s3AFileStatus = s3GetFileStatus(path,
-key,
-probes,
-tombstones,
-needEmptyDirectoryFlag);
-// if the new status is more current than that in the metastore,
-// it means S3 has changed and the store needs updating
-final long s3ModTime = s3AFileStatus.getModificationTime();
-
-if (s3ModTime > msModTime) {
-  // there's new data in S3
-  LOG.debug("S3Guard metadata for {} is outdated;"
-  + " s3modtime={}; msModTime={} updating metastore",
-  path, s3ModTime, msModTime);
-  // add to S3Guard
-  S3Guard.putAndReturn(metadataStore, s3AFileStatus,
-  ttlTimeProvider);
-} else {
-  // the modtime of the data is the same as/older than the s3guard
-  // value either an old object has been found, or the existing one
-  // was retrieved in both cases -refresh the S3Guard entry so the
-  // record's TTL is updated.
-  S3Guard.refreshEntry(metadataStore, pm, s3AFileStatus,
-  ttlTimeProvider);
-}
-// return the value
-// note that the checks for empty dir status below can be skipped
-// because the call to s3GetFileStatus include the checks there
-return s3AFileStatus;
-  } catch (FileNotFoundException fne) {
-// the attempt to refresh the record failed because there was
-// no entry. Either it is a new file not visible, or it
-// has been deleted (and therefore S3Guard is out of sync with S3)
-LOG.warn("Failed to find file {}. Either it is not yet visible, or "
-+ "it has been deleted.", path);
+final long msModTime = pm.getFileStatus().getModificationTime();
+
+S3AFileStatus s3AFileStatus;
+try {
+  s3AFileStatus = s3GetFileStatus(path,
+  key,
+  probes,
+  tombstones,
+  needEmptyDirectoryFlag);
+} catch (FileNotFoundException fne) {
+  s3AFileStatus = null;

Review comment:
   Done. Logging at trace level, so you get the stack when you really, really 
want it.
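
A rough sketch of what logging the swallowed exception at trace level could look like. The patch itself is not quoted here, so everything below is an assumption: `ProbeLoggingSketch`, `Probe` and `probeOrNull` are hypothetical names, and `java.util.logging` (with `Level.FINEST` standing in for trace) is used only to keep the example self-contained; the `LOG` object in `S3AFileSystem` may be a different logging API.

```java
import java.io.FileNotFoundException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical illustration of logging a swallowed FileNotFoundException. */
final class ProbeLoggingSketch {

  private static final Logger LOG =
      Logger.getLogger(ProbeLoggingSketch.class.getName());

  /** Hypothetical stand-in for the S3 status probe. */
  interface Probe {
    String getStatus(String key) throws FileNotFoundException;
  }

  /**
   * Run the probe; if the object is missing, log at the finest level
   * and return null so the caller can decide what to do next.
   */
  static String probeOrNull(Probe probe, String key) {
    try {
      return probe.getStatus(key);
    } catch (FileNotFoundException fne) {
      // Passing the exception as the last argument keeps the stack trace
      // available, but only when the finest (trace-like) level is enabled.
      LOG.log(Level.FINEST, "File not found from probe of " + key, fne);
      return null;
    }
  }
}
```

Logging at the lowest level keeps routine misses out of normal logs while still leaving the stack trace one configuration change away.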







Issue Time Tracking
---

Worklog Id: (was: 497397)
Time Spent: 2h 10m  (was: 2h)


[jira] [Work logged] (HADOOP-17293) S3A to always probe S3 in S3A getFileStatus on non-auth paths

2020-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17293?focusedWorklogId=497382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497382
 ]

ASF GitHub Bot logged work on HADOOP-17293:
---

Author: ASF GitHub Bot
Created on: 08/Oct/20 14:44
Start Date: 08/Oct/20 14:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2361:
URL: https://github.com/apache/hadoop/pull/2361#issuecomment-705619305


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m  4s |  |  https://github.com/apache/hadoop/pull/2361 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/2361 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2361/4/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





Issue Time Tracking
---

Worklog Id: (was: 497382)
Time Spent: 2h  (was: 1h 50m)
