[ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035217#comment-16035217
 ] 

Steve Loughran commented on HADOOP-13998:
-----------------------------------------

Regarding tests, I'm seeing a problem with the combination of s3guard and the 
partition committer (and only that committer): a newly created file is where it 
should be, but the parent dir is still tagged as missing. I can GET the file, 
but if I try to list the parent, the call fails with a FileNotFoundException:
{code}
2017-06-02 18:19:10,709 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
s3.S3AOperations (Logging.scala:logInfo(54)) - 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/part-00000-7573c876-38e5-4024-8a53-51fa1aa9c9c2-c000.snappy.orc
 size=384
2017-06-02 18:19:10,709 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path 
status for 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
  
(cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS)
2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - 
get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS)
 -> file  
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
 3400    UNKNOWN  false 
S3AFileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS;
 isDirectory=false; length=3400; replication=1; blocksize=1048576; 
modification_time=1496423948811; access_time=0; owner=stevel; group=stevel; 
permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; 
isErasureCoded=false} isEmptyDirectory=FALSE
2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerListStatus(1660)) - List status for 
path: 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path 
status for 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
  
(cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc)
2017-06-02 18:19:10,711 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - 
get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc)
 -> file  
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
 0       UNKNOWN  true  
FileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc;
 isDirectory=false; length=0; replication=0; blocksize=0; 
modification_time=1496423936532; access_time=0; owner=; group=; 
permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; 
isErasureCoded=false}
2017-06-02 18:19:10,719 [dispatcher-event-loop-6] INFO  
spark.MapOutputTrackerMasterEndpoint (Logging.scala:logInfo(54)) - 
MapOutputTrackerMasterEndpoint stopped!
2017-06-02 18:19:10,727 [dispatcher-event-loop-3] INFO  
scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint 
(Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped!
2017-06-02 18:19:10,729 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
spark.SparkContext (Logging.scala:logInfo(54)) - Successfully stopped 
SparkContext
- Dataframe+partitioned *** FAILED ***
  java.io.FileNotFoundException: Path 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
 is recorded as deleted by S3Guard
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1906)
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1881)
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1664)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1640)
  at 
com.hortonworks.spark.cloud.ObjectStoreOperations$class.validateRowCount(ObjectStoreOperations.scala:340)
  at 
com.hortonworks.spark.cloud.CloudSuite.validateRowCount(CloudSuite.scala:37)
  at 
com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite.testOneFormat(S3ACommitDataframeSuite.scala:107)
  at 
com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite$$anonfun$1$$anonfun$apply$2.apply$mcV$sp(S3ACommitDataframeSuite.scala:71)
  at 
com.hortonworks.spark.cloud.CloudSuiteTrait$$anonfun$ctest$1.apply$mcV$sp(CloudSuiteTrait.scala:66)
  at 
com.hortonworks.spark.cloud.CloudSuiteTrait$$anonfun$ctest$1.apply(CloudSuiteTrait.scala:64)
{code}
I don't know where the blame lies here, but it's something I'd like to 
understand first. It does not happen when s3guard is off; there the new 
committer works.
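
To make the observed state concrete, here is a toy model of it. This is only a 
sketch under stated assumptions: it is not the S3Guard MetadataStore API, the 
class, enum, and method names are all hypothetical, and the paths are shortened 
stand-ins. It shows how a live entry for a file can coexist with a stale 
"deleted" marker on its parent, and how the marker would go away if adding a 
file also refreshed its ancestors. It is an illustration of the symptom, not a 
diagnosis of where the fault lies.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: NOT the S3Guard MetadataStore API. All names are hypothetical. */
public class TombstoneSketch {

    /** Hypothetical entry states; DELETED plays the role of a tombstone. */
    enum State { FILE, DIRECTORY, DELETED }

    private final Map<String, State> entries = new HashMap<>();

    /** Record a path as deleted (leave a tombstone). */
    public void markDeleted(String path) {
        entries.put(path, State.DELETED);
    }

    /** Record the file only; ancestor entries are left untouched. */
    public void putFileOnly(String path) {
        entries.put(path, State.FILE);
    }

    /** Record the file and mark every ancestor as a directory, replacing any tombstone. */
    public void putFileWithAncestors(String path) {
        entries.put(path, State.FILE);
        for (String p = parentOf(path); p != null; p = parentOf(p)) {
            entries.put(p, State.DIRECTORY);
        }
    }

    /** True if the store holds a tombstone for this exact path. */
    public boolean isRecordedAsDeleted(String path) {
        return entries.get(path) == State.DELETED;
    }

    /** Parent of an absolute path, or null once only a top-level name is left. */
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? null : path.substring(0, slash);
    }

    public static void main(String[] args) {
        String dir = "/partitioned/orc";              // stand-in for the real directory
        String file = dir + "/part-00000.orc";        // stand-in for the committed file

        TombstoneSketch store = new TombstoneSketch();
        store.markDeleted(dir);                       // directory was deleted earlier
        store.putFileOnly(file);                      // file added, parent not refreshed

        System.out.println("file tombstoned:   " + store.isRecordedAsDeleted(file));
        System.out.println("parent tombstoned: " + store.isRecordedAsDeleted(dir));

        store.putFileWithAncestors(file);             // same write, ancestors refreshed
        System.out.println("parent tombstoned after ancestor update: "
            + store.isRecordedAsDeleted(dir));
    }
}
```

In this model the first write leaves the parent tombstoned while the file 
itself is visible, which matches the shape of the log above; the second write 
clears the parent's marker.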

> initial s3guard preview
> -----------------------
>
>                 Key: HADOOP-13998
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13998
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Steve Loughran
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk


