sascha-coenen opened a new issue #9411: S3 InputSource issue when using prefix mode if a directory contains _SUCCESS files
URL: https://github.com/apache/druid/issues/9411

### Affected Version

v 0.17.0

### Description

We set up Druid Indexer nodes to test the new native parallel ingestion. We then used the following `inputSource` section in an `index_parallel` spec to point to a "directory" in S3 that contains a _SUCCESS file along with a number of data files.

` "inputSource": { "type": "s3", "prefixes": ["s3://smt-druid-ingestion-stage/SI-835/year=2020/month=01/day=20/hour=00/1580297687716/auction"] } `

The `index_parallel` task fails, and the logs show that the section above was rewritten to the following:

` "inputSource": { "type": "s3", "uris": null, "prefixes": null, "objects": [ { "bucket": "smt-druid-ingestion-stage", "path": "SI-835/year=2020/month=01/day=20/hour=00/1580297687716/auction/_SUCCESS" } ] } `

Only the _SUCCESS file ends up in the object list; none of the data files do. It looks to me as though an attempt was made to filter _SUCCESS files out of the file list, but the filter condition inadvertently does the opposite.
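
To make the suspicion above concrete, here is a minimal Python sketch of what an inverted filter would look like. It is not Druid's code and does not confirm the actual cause; the function names and the `part-*.gz` data file names are hypothetical, chosen only for illustration.

```python
from typing import Iterable, List

# Hypothetical set of marker file names; not taken from Druid's source.
MARKER_NAMES = {"_SUCCESS"}


def is_marker(key: str) -> bool:
    """Return True if the last path segment of an object key is a marker file."""
    return key.rsplit("/", 1)[-1] in MARKER_NAMES


def data_objects(keys: Iterable[str]) -> List[str]:
    """Expected behaviour: keep the data files and drop the markers."""
    return [k for k in keys if not is_marker(k)]


def inverted_filter(keys: Iterable[str]) -> List[str]:
    """Suspected behaviour: the condition is flipped, so only markers survive."""
    return [k for k in keys if is_marker(k)]


prefix = "SI-835/year=2020/month=01/day=20/hour=00/1580297687716/auction"
# The data file names below are made up; only the _SUCCESS key is from the logs.
listing = [
    f"{prefix}/_SUCCESS",
    f"{prefix}/part-00000.gz",
    f"{prefix}/part-00001.gz",
]

print(data_objects(listing))
print(inverted_filter(listing))
```

With a listing like this, the flipped condition leaves a single `_SUCCESS` entry, which matches the shape of the rewritten `objects` list seen in the logs.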
