gianm commented on PR #13027:
URL: https://github.com/apache/druid/pull/13027#issuecomment-1239058349
Hmm. IMO, we should definitely change something, since the behavior of
`FilenameUtils.wildcardMatch` is just really weird. For example, this returns
`true`:
```
FilenameUtils.wildcardMatch("a/b/c.txt", "a*.txt")
```
Which is weird since no real shell works this way. Generally it is expected
that `*` does not match `/`.
This patch fixes the weirdness by switching to a globbing implementation where
`*` properly doesn't match `/`, and by changing the examples to use `**.suffix`
(which _does_ match `/` in normal shells) instead of `*.suffix`.
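To make the `*` vs. `**` distinction concrete, here is a minimal sketch using the JDK's built-in `PathMatcher` with `glob:` syntax. This is only an illustration of conventional glob semantics; it is an assumption that it resembles the patch, which may use a different implementation. The class and method names (`GlobPathDemo`, `pathMatches`) are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical demo class; not part of Druid.
class GlobPathDemo {
    // Returns true if the entire path matches the glob under JDK glob rules,
    // where `*` stays within one path component and `**` may cross `/`.
    static boolean pathMatches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // `*` does not cross the directory separator, so this is false.
        System.out.println(pathMatches("a*.txt", "a/b/c.txt"));
        // `**` does cross the directory separator, so this is true.
        System.out.println(pathMatches("**.txt", "a/b/c.txt"));
    }
}
```

With these semantics, a user who wants every `.txt` file under a prefix has to write `**.txt`, which is the construction the updated examples rely on.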
However, there is another way to fix it that IMO is better. We could change
the `filter` glob to match file _names_ rather than _paths_. Then, `*.suffix`
would still work fine. It's closer to what the `local` input source does. It's
also close to what the `find` Unix command does when you do `find [directory]
-name [glob]`. (It searches in directory for files whose names match the
provided glob.)
I like this way better because it avoids the awkward `**` construction in
the examples, and avoids the need for people to think about entire paths in
their minds: they can simply think about the file names. (One reason to avoid
working with entire paths is that it gets weird with cloud files. Like, in
`s3://a/b`, will the path-glob be applied to `s3://a/b`, or `/a/b/`, or `/b`,
or `b`? Better to dodge the question entirely by using names, i.e. apply it to
`b` alone.)
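A rough sketch of the name-based alternative, again using the JDK's `PathMatcher` purely for illustration. The helper name `nameMatches` and the choice to take everything after the last `/` as the file name are assumptions, not something the PR specifies.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical demo class; not part of Druid.
class GlobNameDemo {
    // Applies the glob to the last path segment only, similar in spirit to
    // `find [directory] -name [glob]`. Works on plain strings so that
    // URI-like inputs such as "s3://a/b" need no special handling.
    static boolean nameMatches(String glob, String pathOrUri) {
        String name = pathOrUri.substring(pathOrUri.lastIndexOf('/') + 1);
        if (name.isEmpty()) {
            return false; // input ended with '/', so there is no file name
        }
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(name));
    }

    public static void main(String[] args) {
        // Plain `*.txt` works because only "c.txt" is matched against the glob.
        System.out.println(nameMatches("*.txt", "a/b/c.txt"));
        // For a cloud URI, only "b" is considered, whatever the prefix is.
        System.out.println(nameMatches("b", "s3://a/b"));
    }
}
```

Because the glob never sees a `/`, the question of which prefix of a cloud URI the pattern applies to does not come up.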