[ 
https://issues.apache.org/jira/browse/HUDI-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398849#comment-17398849
 ] 

ASF GitHub Bot commented on HUDI-2268:
--------------------------------------

nsivabalan commented on a change in pull request #3470:
URL: https://github.com/apache/hudi/pull/3470#discussion_r688700887



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/upgrade/TwoToOneDowngradeHandler.java
##########
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.table.upgrade;
+
+import org.apache.hudi.common.engine.HoodieEngineContext;
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.table.HoodieTableConfig;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.table.timeline.HoodieTimeline;
+import org.apache.hudi.common.util.MarkerUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.exception.HoodieIOException;
+import org.apache.hudi.table.HoodieSparkTable;
+import org.apache.hudi.table.marker.DirectWriteMarkers;
+import org.apache.hudi.table.marker.MarkerType;
+
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Downgrade handle to assist in downgrading hoodie table from version 2 to 1.
+ */
+public class TwoToOneDowngradeHandler implements DowngradeHandler {
+
+  @Override
+  public void downgrade(HoodieWriteConfig config, HoodieEngineContext context, String instantTime) {
+    HoodieSparkTable table = HoodieSparkTable.create(config, context);
+    HoodieTableMetaClient metaClient = table.getMetaClient();
+    Properties properties = metaClient.getTableConfig().getProps();
+    properties.remove(HoodieTableConfig.HOODIE_TABLE_PARTITION_FIELDS_PROP.key());
+    properties.remove(HoodieTableConfig.HOODIE_TABLE_RECORDKEY_FIELDS.key());
+    properties.remove(HoodieTableConfig.HOODIE_BASE_FILE_FORMAT_PROP.key());
+
+    // Serializes the updated properties
+    // Since the metaclient object is not shared between AbstractUpgradeDownGrade,
+    // we have to serialize the properties here
+    Path propertyFile = new Path(metaClient.getMetaPath() + "/" + HoodieTableConfig.HOODIE_PROPERTIES_FILE);
+    try (FSDataOutputStream os = metaClient.getFs().create(propertyFile)) {
+      properties.store(os, "");
+    } catch (IOException e) {
+      throw new HoodieIOException("Updating hoodie.properties file to remove some of the new props failed ", e);
+    }
+
+    // re-create marker files if any partial timeline server based markers are found
+    HoodieTimeline inflightTimeline = metaClient.getCommitsTimeline().filterPendingExcludingCompaction();
+    List<HoodieInstant> commits = inflightTimeline.getReverseOrderedInstants().collect(Collectors.toList());
+    for (HoodieInstant commitInstant : commits) {
+      // Converts the markers in new format to old format of direct markers
+      convertToDirectMarkers(
+          commitInstant.getTimestamp(), table, context, config.getMarkersDeleteParallelism());
+    }
+  }
+
+  /**
+   * Converts the markers in new format(timeline server based) to old format of direct markers,
+   * i.e., one marker file per data file, without MARKERS.type file.
+   *
+   * @param commitInstantTime instant of interest for marker conversion.
+   * @param table             instance of {@link HoodieSparkTable} to use
+   * @param context           instance of {@link HoodieEngineContext} to use
+   * @param parallelism       parallelism to use
+   */
+  private void convertToDirectMarkers(final String commitInstantTime,
+                                      HoodieSparkTable table,
+                                      HoodieEngineContext context,
+                                      int parallelism) {
+    String markerDir = table.getMetaClient().getMarkerFolderPath(commitInstantTime);
+    FileSystem fileSystem = FSUtils.getFs(markerDir, context.getHadoopConf().newCopy());
+    Option<String> markerTypeOption = MarkerUtils.readMarkerType(fileSystem, markerDir);
+    if (markerTypeOption.isPresent()) {

Review comment:
       There is some work here to ensure idempotency in case of failures. Let's
sync up sometime; I can go over the details.
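
For context on the idempotency concern, here is a minimal sketch of one possible ordering, not the approach the reviewer has in mind (those details are not in this thread). It assumes the conversion can be made retry-safe by recreating the direct markers first and deleting the `MARKERS.type` file last, so that a crash at any earlier point leaves the type file in place and a retry simply redoes the work. The class name, the use of local files via `java.nio.file`, and passing the marker names in as a list are all hypothetical simplifications; the real handler works against Hadoop `FileSystem` and reads marker names from the timeline-server marker files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Hudi code: local files stand in for the marker directory.
class IdempotentMarkerConversionSketch {

  // Marker-type file name taken from the javadoc in the diff above.
  static final String TYPE_FILE = "MARKERS.type";

  /**
   * Recreates one empty direct marker per name, then removes the type file.
   * Returns false when no type file is present, i.e. nothing is left to convert.
   */
  static boolean convertToDirectMarkers(Path markerDir, List<String> markerNames)
      throws IOException {
    Path typeFile = markerDir.resolve(TYPE_FILE);
    if (!Files.exists(typeFile)) {
      return false;
    }
    for (String name : markerNames) {
      Path marker = markerDir.resolve(name);
      Files.createDirectories(marker.getParent());
      // Tolerate markers left behind by an earlier, interrupted attempt.
      if (!Files.exists(marker)) {
        Files.createFile(marker);
      }
    }
    // Deleting the type file last acts as the commit point: a crash before this
    // line leaves the type file in place, so a retry redoes the loop safely.
    Files.delete(typeFile);
    return true;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("markers");
    Files.createFile(dir.resolve(TYPE_FILE));
    List<String> names = Arrays.asList("p1/f1.marker.CREATE", "p2/f2.marker.MERGE");

    // Simulate a partial earlier attempt: one marker already exists.
    Files.createDirectories(dir.resolve("p1"));
    Files.createFile(dir.resolve("p1/f1.marker.CREATE"));

    System.out.println(convertToDirectMarkers(dir, names)); // first run does the work
    System.out.println(convertToDirectMarkers(dir, names)); // second run finds nothing to do
  }
}
```

The same reasoning would apply to the hoodie.properties rewrite earlier in `downgrade`: any step that can be interrupted should be safe to repeat from the start.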





> Upgrade hoodie table to 0.9.0
> -----------------------------
>
>                 Key: HUDI-2268
>                 URL: https://issues.apache.org/jira/browse/HUDI-2268
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Usability
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Regarding upgrading/downgrading hoodie.properties, here is what we can do. 
> Add a new table version, 2. 
> Add an upgrade step:
> Before every write operation, check whether the existing hoodie.properties 
> is at an older version. If yes, perform the upgrade step to version 2 
> (either from 0 to 2 or from 1 to 2). This essentially means that we need 
> to add new properties pertaining to SQL DML to hoodie.properties. 
> Things to watch out for:
> For some operations, not all props might be set by the user, so we might 
> need to throw an exception (record key field, partition path field, key gen 
> prop, precombine field). 
> We need to fetch the latest table schema, since the incoming df could have 
> partial cols.
>  
> Downgrade step: 
> hoodie.properties will have some additional properties. Should not cause any 
> harm. All we need to do is to downgrade the table version to target version 
> and not touch any of the props. 
>  
>  
>  
>  
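
The upgrade and downgrade steps described in the issue could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in the pull request: the class, the property keys (`table.version`, `recordkey.field`, and so on), and the idea of taking missing values from a user-supplied `Properties` object are all hypothetical placeholders. The sketch validates before mutating, so a failed upgrade leaves the table properties unchanged, and its downgrade changes only the version, as the issue description proposes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch, not Hudi code: every property key below is a placeholder.
class TableVersionUpgradeSketch {

  static final String VERSION_KEY = "table.version";
  static final int TARGET_VERSION = 2;
  static final List<String> REQUIRED_KEYS = Arrays.asList(
      "recordkey.field", "partitionpath.field", "keygenerator.class", "precombine.field");

  /** Upgrades tableProps to version 2; returns false if it is already there. */
  static boolean upgradeIfNeeded(Properties tableProps, Properties userProps) {
    int current = Integer.parseInt(tableProps.getProperty(VERSION_KEY, "0"));
    if (current >= TARGET_VERSION) {
      return false;
    }
    // Validate first so that a failure leaves tableProps untouched.
    for (String key : REQUIRED_KEYS) {
      if (tableProps.getProperty(key) == null && userProps.getProperty(key) == null) {
        throw new IllegalStateException("Cannot upgrade table: missing property " + key);
      }
    }
    // Fill in only the properties the table does not already carry.
    for (String key : REQUIRED_KEYS) {
      if (tableProps.getProperty(key) == null) {
        tableProps.setProperty(key, userProps.getProperty(key));
      }
    }
    tableProps.setProperty(VERSION_KEY, String.valueOf(TARGET_VERSION));
    return true;
  }

  /** Downgrade as described in the issue: change the version, leave other props alone. */
  static void downgrade(Properties tableProps, int targetVersion) {
    tableProps.setProperty(VERSION_KEY, String.valueOf(targetVersion));
  }

  public static void main(String[] args) {
    Properties table = new Properties();
    table.setProperty(VERSION_KEY, "1");

    Properties user = new Properties();
    for (String key : REQUIRED_KEYS) {
      user.setProperty(key, "value-for-" + key);
    }

    System.out.println(upgradeIfNeeded(table, user)); // upgrade from 1 to 2
    System.out.println(upgradeIfNeeded(table, user)); // already at version 2
    downgrade(table, 1);
    System.out.println(table.getProperty(VERSION_KEY));
  }
}
```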


