[ https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335578#comment-15335578 ]
ASF GitHub Bot commented on FLINK-3677:
---------------------------------------
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/2109#discussion_r67468955
--- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/FilesFilter.java ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.common.io;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.core.fs.Path;
+
+import java.io.Serializable;
+import java.nio.file.FileSystem;
+import java.nio.file.FileSystems;
+import java.nio.file.PathMatcher;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * Class for determining if a particular file should be included or excluded.
+ *
+ * <p> If it does not match an include pattern it is excluded. If it matches an include
+ * pattern but also matches an exclude pattern it is excluded.
+ *
+ * <p> If no patterns are provided all files are included
+ */
+@Internal
+public class FilesFilter implements Serializable {
+
+ private static final long serialVersionUID = 1L;
+
+ private final List<PathMatcher> includeMatchers;
+ private final List<PathMatcher> excludeMatchers;
+
+ /**
+ * Constructor for FilesFilter
+ *
+ * @param includePatterns glob patterns for files to include
+ * @param excludePatterns glob patterns for files to exclude
+ */
+ public FilesFilter(String[] includePatterns, String[] excludePatterns) {
+ includeMatchers = buildPatterns(includePatterns);
+ excludeMatchers = buildPatterns(excludePatterns);
+ }
+
+ private List<PathMatcher> buildPatterns(String[] patterns) {
+ FileSystem fileSystem = FileSystems.getDefault();
+ List<PathMatcher> matchers = new ArrayList<>();
+
+ for (String patternStr : patterns) {
+ matchers.add(fileSystem.getPathMatcher("glob:" + patternStr));
--- End diff --
Will these matchers also work with files that reside in HDFS?
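One way to probe this question is the sketch below. It is not code from the pull request: the class `GlobOnUriPathSketch`, the helper `matchesFileName`, and the `hdfs://namenode:9000/...` URIs are hypothetical and chosen only for illustration. It assumes the filter only needs the file name, and that the path component of the URI can be parsed by the local default file system. Under those assumptions a `PathMatcher` from `FileSystems.getDefault()` can be applied to an HDFS-style location, since glob matching compares path strings and, as far as I know, does not access the file itself.

```java
import java.net.URI;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobOnUriPathSketch {

    // Hypothetical helper, not part of the pull request under review.
    // Matches a glob against the file name taken from a URI such as hdfs://host:port/dir/file.
    static boolean matchesFileName(String glob, String fileUri) {
        // Same matcher construction as in the diff: glob syntax on the default file system.
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);

        // Drop scheme and authority; only the path component is handed to java.nio.
        String pathPart = URI.create(fileUri).getPath();

        // Assumption: the remaining path is syntactically valid on the local file system.
        Path fileName = Paths.get(pathPart).getFileName();
        return fileName != null && matcher.matches(fileName);
    }

    public static void main(String[] args) {
        // Hypothetical HDFS locations; no cluster is contacted.
        System.out.println(matchesFileName("*.txt", "hdfs://namenode:9000/data/logs/a.txt")); // true
        System.out.println(matchesFileName("*.txt", "hdfs://namenode:9000/data/logs/a.csv")); // false
    }
}
```

The caveat is that this relies on local path syntax: a remote path containing characters that the local file system rejects would fail in `Paths.get`, and matching against full paths instead of file names would depend on the local separator. Whether `FilesFilter` converts Flink's `Path` in a comparable way is not visible in the quoted part of the diff.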
> FileInputFormat: Allow to specify include/exclude file name patterns
> --------------------------------------------------------------------
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.0.0
> Reporter: Maximilian Michels
> Assignee: Ivan Mushketyk
> Priority: Minor
> Labels: starter
>
> It would be nice to be able to specify a regular expression to filter files.