xushiyan commented on a change in pull request #4083:
URL: https://github.com/apache/hudi/pull/4083#discussion_r788428757



##########
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/InputPathHandler.java
##########
@@ -53,25 +57,27 @@
   public static final Logger LOG = 
LogManager.getLogger(InputPathHandler.class);
 
   private final Configuration conf;
-  // tablename to metadata mapping for all Hoodie tables(both incremental & 
snapshot)
+  // tableName to metadata mapping for all Hoodie tables(both incremental & 
snapshot)
   private final Map<String, HoodieTableMetaClient> tableMetaClientMap;
   private final Map<HoodieTableMetaClient, List<Path>> groupedIncrementalPaths;
   private final List<Path> snapshotPaths;
   private final List<Path> nonHoodieInputPaths;
+  private boolean isIncrementalUseDatabase;
 
-  public InputPathHandler(Configuration conf, Path[] inputPaths, List<String> 
incrementalTables) throws IOException {
+  public InputPathHandler(Configuration conf, Path[] inputPaths, List<String> 
incrementalTables, JobConf job) throws IOException {
     this.conf = conf;
     tableMetaClientMap = new HashMap<>();
     snapshotPaths = new ArrayList<>();
     nonHoodieInputPaths = new ArrayList<>();
     groupedIncrementalPaths = new HashMap<>();
+    this.isIncrementalUseDatabase = 
HoodieHiveUtils.isIncrementalUseDatabase(Job.getInstance(job));

Review comment:
       `JobConf` is passed in just to compute a boolean config? can we somehow 
make the config extracted from `Configuration conf` ? always prefer to have 
less args

##########
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/InputPathHandler.java
##########
@@ -117,9 +124,11 @@ private void parseInputPaths(Path[] inputPaths, 
List<String> incrementalTables)
     }
   }
 
-  private void tagAsIncrementalOrSnapshot(Path inputPath, String tableName,
-      HoodieTableMetaClient metaClient, List<String> incrementalTables) {
-    if (!incrementalTables.contains(tableName)) {
+  private void tagAsIncrementalOrSnapshot(Path inputPath, 
HoodieTableMetaClient metaClient, List<String> incrementalTables) {
+    String databaseName = metaClient.getTableConfig().getDatabaseName();
+    String tableName = metaClient.getTableConfig().getTableName();
+    if ((isIncrementalUseDatabase && !StringUtils.isNullOrEmpty(databaseName) 
&& !incrementalTables.contains(databaseName + "." + tableName))
+            || (!(isIncrementalUseDatabase && 
!StringUtils.isNullOrEmpty(databaseName)) && 
!incrementalTables.contains(tableName))) {

Review comment:
       this condition check is pretty hard to read.. can we improve this?

##########
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/InputPathHandler.java
##########
@@ -95,19 +101,20 @@ private void parseInputPaths(Path[] inputPaths, 
List<String> incrementalTables)
           // We already know the base path for this inputPath.
           basePathKnown = true;
           // Check if this is for a snapshot query
-          String tableName = metaClient.getTableConfig().getTableName();
-          tagAsIncrementalOrSnapshot(inputPath, tableName, metaClient, 
incrementalTables);
+          tagAsIncrementalOrSnapshot(inputPath, metaClient, incrementalTables);
           break;
         }
       }
       if (!basePathKnown) {
-        // This path is for a table that we dont know about yet.
+        // This path is for a table that we don't know about yet.
         HoodieTableMetaClient metaClient;
         try {
           metaClient = 
getTableMetaClientForBasePath(inputPath.getFileSystem(conf), inputPath);
+          String databaseName = metaClient.getTableConfig().getDatabaseName();
           String tableName = metaClient.getTableConfig().getTableName();
-          tableMetaClientMap.put(tableName, metaClient);
-          tagAsIncrementalOrSnapshot(inputPath, tableName, metaClient, 
incrementalTables);
+          tableMetaClientMap.put(isIncrementalUseDatabase && 
!StringUtils.isNullOrEmpty(databaseName)
+                  ? databaseName + "." + tableName : tableName, metaClient);

Review comment:
       can we move the table name creation logic into a helper method?

##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##########
@@ -76,6 +76,11 @@
   public static final String HOODIE_PROPERTIES_FILE = "hoodie.properties";
   public static final String HOODIE_PROPERTIES_FILE_BACKUP = 
"hoodie.properties.backup";
 
+  public static final ConfigProperty<String> DATABASE_NAME = ConfigProperty
+      .key("hoodie.database.name")
+      .noDefaultValue()
+      .withDocumentation("Database name that will be used for incremental 
query.");

Review comment:
       this is not just for incremental query now, right? spark sql also uses 
it. Better to add more info here to explain the use cases of this config, which 
will show up in the website for users to understand.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to