Hisoka-X commented on code in PR #7922:
URL: https://github.com/apache/seatunnel/pull/7922#discussion_r1818301294
##########
docs/en/connector-v2/source/LocalFile.md:
##########
@@ -254,6 +254,72 @@ Specifies Whether to process data using the tag attribute format.
Filter pattern, which used for filtering files.
+The filtering format is similar to wildcard matching file names in Linux.
Review Comment:
We should not give users an ambiguous description. Please state directly
that the filter pattern uses Java regular expressions.
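
To show why the wildcard wording is misleading, here is a minimal sketch. It assumes the pattern is applied to the whole name with full-match semantics (`Matcher.matches()`); that is an assumption for illustration, not a statement about how the connector applies it. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

class FilterPatternDemo {
    // Hypothetical helper: full-match semantics are assumed here,
    // not taken from the connector source.
    static boolean matches(String regex, String fileName) {
        return Pattern.compile(regex).matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String name = "abc20241022.csv";

        // Read as a regex, "abc*" means "ab" followed by zero or more "c".
        System.out.println(matches("abc*", name));        // false

        // "abc*.*" matches, but only because ".*" means "any characters".
        System.out.println(matches("abc*.*", name));      // true
        // The same pattern also accepts names that do not start with "abc".
        System.out.println(matches("abc*.*", "abX.txt")); // true

        // Patterns that say what is meant in regex terms.
        System.out.println(matches("abc.*", name));       // true
        System.out.println(matches("abc.*\\.csv", name)); // true
    }
}
```

Under these assumptions, `abc*.*` works by accident rather than because "the middle dot cannot be omitted", which is why the docs should describe the option as a regular expression.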
##########
docs/en/connector-v2/source/LocalFile.md:
##########
@@ -254,6 +254,72 @@ Specifies Whether to process data using the tag attribute format.
Filter pattern, which used for filtering files.
+The filtering format is similar to wildcard matching file names in Linux.
+
+| Wildcard | Meaning | Example |
+|--------------|---------|---------|
+| * | Match 0 or more characters | f*     Any file starting with f<br/>b*.txt   Any file starting with b, any character in the middle, and ending with. txt |
+| [] | Match a single character in parentheses | [abc]*   A file that starts with any one of the characters a, b, or c |
+| ? | Match any single character | f?.txt   Any file starting with 'f' followed by a character and ending with '. txt' |
+| [!] | Match any single character not in parentheses | [!abc]*   Any file that does not start with abc |
+| [a-z] | Match any single character from a to z | [a-z]*   Any file starting with a to z |
+| {a,b,c}/a..z | When separated by commas, it represents individual characters<br/>When separated by two dots, represents continuous characters | {a,b,c}*   Files starting with any character from abc<br/>{a..Z}*   Files starting with any character from a to z |
+
+However, it should be noted that unlike Linux wildcard characters, when encountering file suffixes, the middle dot cannot be omitted.
+
+For example, `abc20241022.csv`, the normal Linux wildcard `abc*` is sufficient, but here we need to use `abc*.*` , Pay attention to a point in the middle.
+
+File Structure Example:
+```
+report.txt
+notes.txt
+input.csv
+abch20241022.csv
+abcw20241022.csv
+abcx20241022.csv
+abcq20241022.csv
+abcg20241022.csv
+abcv20241022.csv
+abcb20241022.csv
+old_data.csv
+logo.png
+script.sh
+helpers.sh
Review Comment:
Please add some file paths to the example, so that it covers matching on
paths and not only on file names.
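
As a rough sketch of what a path-based example could demonstrate: the directory layout below is hypothetical, and matching the pattern against the full path with `Matcher.matches()` is an assumption for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PathFilterDemo {
    // Hypothetical helper: keeps the paths that fully match the regex.
    static List<String> filter(String regex, List<String> paths) {
        Pattern pattern = Pattern.compile(regex);
        return paths.stream()
                .filter(p -> pattern.matcher(p).matches())
                .collect(Collectors.toList());
    }

    // Hypothetical paths, invented for this example.
    static final List<String> PATHS = Arrays.asList(
            "/data/seatunnel/20241022/report.txt",
            "/data/seatunnel/20241022/abch20241022.csv",
            "/data/seatunnel/20241007/abcg20241007.csv",
            "/data/seatunnel/logo.png");

    public static void main(String[] args) {
        // Every .csv file under one dated directory.
        System.out.println(filter("/data/seatunnel/20241022/.*\\.csv", PATHS));
        // Files named abc + one lowercase letter + 8 digits, in any directory.
        System.out.println(filter(".*/abc[a-z]\\d{8}\\.csv", PATHS));
    }
}
```

The first pattern keeps only the `.csv` file under `20241022`; the second keeps both `abc?########.csv` files regardless of directory.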
##########
docs/en/connector-v2/source/LocalFile.md:
##########
@@ -254,6 +254,72 @@ Specifies Whether to process data using the tag attribute format.
Filter pattern, which used for filtering files.
+The filtering format is similar to wildcard matching file names in Linux.
+
+| Wildcard | Meaning | Example |
+|--------------|---------|---------|
+| * | Match 0 or more characters | f*     Any file starting with f<br/>b*.txt   Any file starting with b, any character in the middle, and ending with. txt |
+| [] | Match a single character in parentheses | [abc]*   A file that starts with any one of the characters a, b, or c |
+| ? | Match any single character | f?.txt   Any file starting with 'f' followed by a character and ending with '. txt' |
+| [!] | Match any single character not in parentheses | [!abc]*   Any file that does not start with abc |
+| [a-z] | Match any single character from a to z | [a-z]*   Any file starting with a to z |
+| {a,b,c}/a..z | When separated by commas, it represents individual characters<br/>When separated by two dots, represents continuous characters | {a,b,c}*   Files starting with any character from abc<br/>{a..Z}*   Files starting with any character from a to z |
+
+However, it should be noted that unlike Linux wildcard characters, when encountering file suffixes, the middle dot cannot be omitted.
+
+For example, `abc20241022.csv`, the normal Linux wildcard `abc*` is sufficient, but here we need to use `abc*.*` , Pay attention to a point in the middle.
Review Comment:
Please replace this part with a link to
https://en.wikipedia.org/wiki/Regular_expression and let users learn
regular expressions from there.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.