sascha-coenen opened a new issue #9411: S3 InputSource issue when using prefix mode if a directory contains _SUCCESS files
URL: https://github.com/apache/druid/issues/9411
 
 
   ### Affected Version
   v 0.17.0
   
   ### Description
   We set up Druid Indexer nodes to test the new native parallel ingestion. 
   We then used the following inputSource section in an index_parallel spec 
to point to a "directory" (key prefix) in S3 that contains a _SUCCESS file 
alongside a number of data files.
   
   `
         "inputSource": {
           "type": "s3",
           "prefixes": ["s3://smt-druid-ingestion-stage/SI-835/year=2020/month=01/day=20/hour=00/1580297687716/auction"]
         }
   `
   
   The index_parallel task failed, and the logs showed that the section 
above had been rewritten to the following:
   
   `
         "inputSource": {
           "type": "s3",
           "uris": null,
           "prefixes": null,
           "objects": [
             {
               "bucket": "smt-druid-ingestion-stage",
               "path": "SI-835/year=2020/month=01/day=20/hour=00/1580297687716/auction/_SUCCESS"
             }
           ]
         }
   `
   
   It looks to me as though an attempt was made to filter _SUCCESS files out 
of the file list, but the filter condition inadvertently does the opposite: 
the rewritten spec keeps only the _SUCCESS file and drops the data files.
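
To make the suspicion concrete, here is a minimal sketch of the difference between the expected filter and an inverted one. It is not taken from the Druid source, which I have not inspected; the function names and the `part-*.json` file names are hypothetical, and only the prefix and the `_SUCCESS` key come from the spec and logs in this report.

```python
# Illustration only: hypothetical helpers, not Druid code.
MARKER = "_SUCCESS"


def is_marker(key: str) -> bool:
    """Return True if the last path segment of the key is the _SUCCESS marker."""
    return key.rsplit("/", 1)[-1] == MARKER


def expected_objects(keys):
    """Expected behaviour: keep every key under the prefix except the marker."""
    return [k for k in keys if not is_marker(k)]


def inverted_objects(keys):
    """Suspected behaviour: the condition is flipped, so only the marker survives."""
    return [k for k in keys if is_marker(k)]


prefix = "SI-835/year=2020/month=01/day=20/hour=00/1580297687716/auction"
# The data file names below are made up; the report does not list them.
keys = [
    f"{prefix}/_SUCCESS",
    f"{prefix}/part-00000.json",
    f"{prefix}/part-00001.json",
]

print(expected_objects(keys))
print(inverted_objects(keys))
```

With a flipped condition the resulting object list contains only the `_SUCCESS` key, which matches the rewritten inputSource seen in the task logs.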
   
   
