[ 
https://issues.apache.org/jira/browse/RAT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980887#comment-16980887
 ] 

Raphael von der Grün edited comment on RAT-265 at 11/23/19 9:37 PM:
--------------------------------------------------------------------

Thanks for looking into this. I'm on Ubuntu using bash. But I'm fairly certain 
that this is not a quoting issue.

However, I just noticed that I had an error in my reproduction command. The 
value of the `-d` option has to be a directory of course. I just updated it in 
the original post. The correct command is the following:
{noformat}
java -jar apache-rat-0.14-20191120.132901-66.jar -e "*.txt" -d apache-rat-core/src/test/resources/violations
{noformat}
The quotes around the `-e` value cannot be omitted. The reason you do not see 
the warning with an unquoted exclude pattern is shell expansion:
{noformat}
$ echo *.txt
BUILD.txt README.txt RELEASE-NOTES.txt RELEASE_NOTES.txt
{noformat}
Since the shell expands the glob before passing it to RAT, the program never 
sees the literal `*.txt`, so you don't get the warning from before.
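
To make the difference concrete, here is a minimal Java sketch. It is not RAT's actual code; the class and method names are made up for illustration. It only checks whether a given argument compiles as a `java.util.regex` pattern, which is the step that fails for the quoted glob. The file names are the ones from the `echo` output above.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of Apache Rat.
final class ExclusionRegexCheck {

    /** Returns true if the argument compiles as a java.util.regex pattern. */
    static boolean compilesAsRegex(String exclusion) {
        try {
            Pattern.compile(exclusion);
            return true;
        } catch (PatternSyntaxException e) {
            // For "*.txt" this is the "Dangling meta character '*'" case.
            return false;
        }
    }

    public static void main(String[] args) {
        // Quoted on the command line: the literal glob reaches the JVM.
        System.out.println("*.txt -> " + compilesAsRegex("*.txt"));

        // Unquoted: the shell has already replaced the glob with file names,
        // and each of those happens to be a syntactically valid regex.
        String[] expanded = {
            "BUILD.txt", "README.txt", "RELEASE-NOTES.txt", "RELEASE_NOTES.txt"
        };
        for (String name : expanded) {
            System.out.println(name + " -> " + compilesAsRegex(name));
        }
    }
}
```

With the unquoted form, every argument compiles, so no warning is printed, but the exclusion is then also no longer the pattern that was intended.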


> CLI: Certain wildcard file filters do not work anymore
> ------------------------------------------------------
>
>                 Key: RAT-265
>                 URL: https://issues.apache.org/jira/browse/RAT-265
>             Project: Apache Rat
>          Issue Type: Bug
>          Components: cli
>    Affects Versions: 0.13, 0.14
>            Reporter: Raphael von der Grün
>            Priority: Major
>
> Run the following command in the root of the `rat` repo:
> {noformat}
> java -jar apache-rat-0.14-20191120.132901-66.jar -e "*.txt" -d apache-rat-core/src/test/resources/violations
> {noformat}
> This will give the following output on `stderr`: 
> {noformat}
> Will skip given exclusion '*.txt' due to java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
> *.txt
> ^
> {noformat}
> Furthermore, `bad.txt` will NOT be excluded from the license check.
> The error that causes this is thrown in [line 132 of 
> `org.apache.rat.Report.java`|#L132]. The reason is simple: any glob pattern 
> that starts with `*` or `?` is not a valid regex. When line 132 throws, the 
> next two lines are also skipped, so the pattern is not added at all.
> Unfortunately, a solution to this problem is not so simple. In `v0.12` the 
> `-e` option always added wildcard filters while `-E` always added regex 
> filters. The documentation still states the same in the latest `v0.14` 
> snapshot. Beginning with `v0.13` the code tries to add any exclude rule as 
> three different filters. I believe this approach is inherently flawed.
> Firstly, the `new NameFileFilter(exclusion)` is redundant if we also add `new 
> WildcardFileFilter(exclusion)`. The files matched by the `NameFileFilter` are 
> a subset of those matched by the `WildcardFileFilter` since any magic 
> character (i.e. `?` or `*`) in `exclusion` also matches itself when used in a 
> `WildcardFileFilter`.
> So let's assume we only register the `WildcardFileFilter` and the 
> `RegexFileFilter`. Even if we properly add patterns that are not valid 
> regexes as wildcard filters, there are still patterns where we cannot decide 
> what the user's intention was. Consider the pattern `bi.ini`. Should it be 
> interpreted as a wildcard pattern and match only itself, or should it be 
> interpreted as a regex and also match `bikini`, for example?
> My recommendation for a quick patch solution would be to go back to the 
> exclusion behavior of `v0.12`.
> Beyond that, the nicest solution IMHO would be support for ignore files with 
> the same semantics as `.gitignore` (via `-E`) and support for giving extended 
> shell globs via `-e`.
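
The `bi.ini` ambiguity described in the issue can be sketched in a few lines of Java. This is an illustration only: `globToRegex` is a hypothetical helper that handles just `*` and `?`, and it is neither RAT's code nor the commons-io `WildcardFileFilter` implementation.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Apache Rat or commons-io.
final class GlobVersusRegex {

    /** Translates a simple glob (only '*' and '?' are special) into a regex. */
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");          // any run of characters
            } else if (c == '?') {
                regex.append('.');           // exactly one character
            } else {
                // Everything else, including '.', is taken literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Interprets the pattern as a wildcard expression. */
    static boolean matchesAsGlob(String pattern, String fileName) {
        return globToRegex(pattern).matcher(fileName).matches();
    }

    /** Interprets the pattern as a regular expression. */
    static boolean matchesAsRegex(String pattern, String fileName) {
        return Pattern.matches(pattern, fileName);
    }

    public static void main(String[] args) {
        // Same pattern, two readings, two different answers for "bikini".
        System.out.println(matchesAsGlob("bi.ini", "bikini"));   // false
        System.out.println(matchesAsRegex("bi.ini", "bikini"));  // true

        // As a glob, "*.txt" is well defined and matches "bad.txt".
        System.out.println(matchesAsGlob("*.txt", "bad.txt"));   // true
    }
}
```

Because both readings are valid for `bi.ini`, the program cannot infer the intended one from the pattern alone. Keeping the two interpretations behind separate options, as the issue describes for `v0.12`, sidesteps that guess.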



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
