gianm commented on PR #13027:
URL: https://github.com/apache/druid/pull/13027#issuecomment-1239058349

   Hmm. IMO, we should definitely change something, since the behavior of 
`FilenameUtils.wildcardMatch` is just really weird. For example, this returns 
`true`:
   
   ```
   FilenameUtils.wildcardMatch("a/b/c.txt", "a*.txt")
   ```
   
   Which is weird since no real shell works this way. Generally it is expected 
that `*` does not match `/`.
   
   This patch fixes the weirdness by switching to a globbing implementation 
where `*` properly doesn't match `/`, and by changing the examples to use 
`**.suffix` (which _does_ match `/` in normal shells) instead of `*.suffix`.
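
   To make the `*` vs. `**` difference concrete, here is a minimal sketch using the JDK's built-in `PathMatcher` glob syntax. This is only an illustration: the class and method names are made up, and I'm not claiming this is the implementation the patch uses.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Illustration only: relies on the JDK's "glob:" syntax, which is an
// assumption here and may differ from what the patch actually uses.
class GlobSemantics {
    // Returns true if the entire path matches the glob pattern.
    static boolean globMatches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // A single `*` stops at directory boundaries.
        System.out.println(globMatches("a*.txt", "a/b/c.txt"));
        System.out.println(globMatches("*.txt", "a/b/c.txt"));
        // A double `**` crosses directory boundaries.
        System.out.println(globMatches("a**.txt", "a/b/c.txt"));
        System.out.println(globMatches("**.txt", "a/b/c.txt"));
    }
}
```

   With these semantics the first two calls should come back `false` and the last two `true`, which is why the examples would need `**.suffix`.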
   
   However, there is another way to fix it that IMO is better. We could change 
the `filter` glob to match file _names_ rather than _paths_. Then, `*.suffix` 
would still work fine. It's closer to what the `local` input source does. It's 
also close to what the `find` Unix command does when you do `find [directory] 
-name [glob]`. (It searches the directory for files whose names match the 
provided glob.)
   
   I like this way better because it avoids the awkward `**` construction in 
the examples, and avoids the need for people to think about entire paths: 
they can simply think about the file names. (One reason to avoid working with 
entire paths is that it gets weird with cloud files. Like, in `s3://a/b`, will 
the path-glob be applied to `s3://a/b`, or `/a/b/`, or `/b`, or `b`? Better to 
dodge the question entirely by using names, i.e. apply it to `b` alone.)
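
   Roughly, the name-based approach could look like the sketch below. The `NameFilter` class and its methods are hypothetical, and it assumes the name is simply whatever follows the last `/` in the path or URI. It again uses the JDK glob matcher for illustration, not necessarily what Druid would use.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper: applies the filter glob to the file name only,
// similar in spirit to `find [directory] -name [glob]`.
class NameFilter {
    // Assumption: the "name" is everything after the last '/'.
    static String fileName(String uriOrPath) {
        int slash = uriOrPath.lastIndexOf('/');
        return slash < 0 ? uriOrPath : uriOrPath.substring(slash + 1);
    }

    // Returns true if the name portion alone matches the glob.
    static boolean nameMatches(String glob, String uriOrPath) {
        String name = fileName(uriOrPath);
        if (name.isEmpty()) {
            // Nothing after the last '/', e.g. a directory-like key.
            return false;
        }
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(name));
    }

    public static void main(String[] args) {
        // `*.txt` works again, because only "c.txt" is matched.
        System.out.println(nameMatches("*.txt", "a/b/c.txt"));
        // For `s3://a/b`, the glob is applied to "b" alone.
        System.out.println(nameMatches("b", "s3://a/b"));
    }
}
```

   The point is that the bucket, scheme, and prefix never reach the matcher, so the question of which path form the glob sees goes away.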


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
