Isaacwhyuenac opened a new pull request #15609:
URL: https://github.com/apache/airflow/pull/15609


   This change improves how S3 files are found and gives developers 
the freedom to list exactly what they want. 
   
   For example, suppose an S3 bucket contains:
   
   ```
   2021-04-30 10:38:27   11.7 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_31a98366-e1a0-428e-a57f-3c4b0fe2bb79
   2021-04-30 10:38:27   49.1 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_55438602-3787-4c09-9fad-562e7a6786cb
   2021-04-30 10:38:31   10.6 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_6773215f-1697-4c99-9e94-f7961e86af62
   2021-04-30 10:38:31   27.1 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_69c952f5-97b4-45e9-b790-fc7830fb2150
   2021-04-30 10:38:31  131.2 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_b4b995f5-211d-4d46-bd9a-86912b29d978
   2021-04-30 10:38:27  166.2 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_bbcebd80-c280-4e66-9431-9a626df8bc33
   2021-04-30 10:38:30  171.6 KiB agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/20210430_023708_00016_g2f4v_f4ef423f-cf70-4f71-960e-70f1bdddaf3d
   
   2021-04-30 10:38:27   11.7 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_31a98366-e1a0-428e-a57f-3c4b0fe2bb79
   2021-04-30 10:38:27   49.1 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_55438602-3787-4c09-9fad-562e7a6786cb
   2021-04-30 10:38:31   10.6 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_6773215f-1697-4c99-9e94-f7961e86af62
   2021-04-30 10:38:31   27.1 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_69c952f5-97b4-45e9-b790-fc7830fb2150
   2021-04-30 10:38:31  131.2 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_b4b995f5-211d-4d46-bd9a-86912b29d978
   2021-04-30 10:38:27  166.2 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_bbcebd80-c280-4e66-9431-9a626df8bc33
   2021-04-30 10:38:30  171.6 KiB agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/20210430_023708_00016_g2f4v_f4ef423f-cf70-4f71-960e-70f1bdddaf3d
   ```
   
   Under the current scheme, both of these prefixes are matched by 
`agg/user_segmentation/tag_article_user_analysis/`, because the trailing slash is 
stripped off and the remaining string is also a prefix of 
`tag_article_user_analysis_v2/`. Developers should be free to choose the pattern 
they want to match instead of having one forced on them.
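
To illustrate the effect, here is a minimal sketch in plain Python. It does not call S3 or Airflow; it only assumes that S3 listing selects keys by plain string-prefix match. The helper `list_keys` and the shortened key names are hypothetical, made up for this example.

```python
# Shortened, hypothetical object keys modeled on the listing above.
keys = [
    "agg/user_segmentation/tag_article_user_analysis/partition_0=2021-04-29/file_a",
    "agg/user_segmentation/tag_article_user_analysis_v2/partition_0=2021-04-29/file_a",
]


def list_keys(all_keys, prefix):
    """Mimic an S3 prefix listing: keep the keys that start with the prefix."""
    return [key for key in all_keys if key.startswith(prefix)]


prefix = "agg/user_segmentation/tag_article_user_analysis/"

# With the trailing slash stripped, the "_v2" directory is matched as well.
stripped = list_keys(keys, prefix.rstrip("/"))

# With the prefix passed through unchanged, only the intended directory matches.
exact = list_keys(keys, prefix)

print(len(stripped), len(exact))
```

Leaving the prefix untouched lets the caller decide: passing it with the trailing slash selects only that directory, and passing it without the slash selects every key sharing the stem.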
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

