[
https://issues.apache.org/jira/browse/HUDI-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376616#comment-17376616
]
ASF GitHub Bot commented on HUDI-2087:
--------------------------------------
yanghua commented on a change in pull request #3174:
URL: https://github.com/apache/hudi/pull/3174#discussion_r665440709
##########
File path:
hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java
##########
@@ -106,11 +108,18 @@
    */
   private final boolean globalIndex;
+  private final boolean appendOnly;
+
   public BucketAssignFunction(Configuration conf) {
     this.conf = conf;
     this.isChangingRecords = WriteOperationType.isChangingRecords(
         WriteOperationType.fromValue(conf.getString(FlinkOptions.OPERATION)));
     this.globalIndex = conf.getBoolean(FlinkOptions.INDEX_GLOBAL_ENABLED);
+    this.appendOnly = conf.getBoolean(FlinkOptions.APPEND_ONLY_ENABLE);
+    if (appendOnly) {
+      ValidationUtils.checkArgument(conf.getString(FlinkOptions.TABLE_TYPE).equals(HoodieTableType.COPY_ON_WRITE.name()),
+          "Append only only support in COPY_ON_WRITE.");
Review comment:
Would "APPEND_ONLY mode is only supported for COPY_ON_WRITE tables." sound better?
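A minimal sketch of the check under discussion, with the clearer message applied. `checkArgument` here is a stand-in for Hudi's `ValidationUtils.checkArgument`, and the configured table type is a hypothetical value chosen to trigger the failure:

```java
public class AppendOnlyCheck {

    // Stand-in for org.apache.hudi.common.util.ValidationUtils.checkArgument
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    public static void main(String[] args) {
        // Hypothetical configured value; append-only must reject anything but COW
        String tableType = "MERGE_ON_READ";
        try {
            checkArgument("COPY_ON_WRITE".equals(tableType),
                "APPEND_ONLY mode is only supported for COPY_ON_WRITE tables.");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```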
##########
File path:
hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnRead.java
##########
@@ -86,6 +88,12 @@ protected void checkWrittenData(File baseFile, Map<String, String> expected, int
     return EXPECTED1;
   }
+ @Disabled
+ @Test
+ public void testAppendOnly() throws Exception {
+ // Ignore the append only because only support cow.
Review comment:
Is this test useless as-is? Or could we instead assert that an exception is thrown for a MOR table?
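One way to make the disabled test meaningful, as the reviewer suggests, is to assert that enabling append-only on a MERGE_ON_READ table is rejected. A self-contained sketch of that pattern; the `validate` method is a stand-in for the constructor-time check in `BucketAssignFunction`, not Hudi's actual API:

```java
public class TestAppendOnlyRejected {

    // Stand-in for the append-only validation performed in BucketAssignFunction
    static void validate(String tableType, boolean appendOnly) {
        if (appendOnly && !"COPY_ON_WRITE".equals(tableType)) {
            throw new IllegalArgumentException(
                "APPEND_ONLY mode is only supported for COPY_ON_WRITE tables.");
        }
    }

    public static void main(String[] args) {
        boolean thrown = false;
        try {
            // MOR + append-only is the invalid combination the test should cover
            validate("MERGE_ON_READ", true);
        } catch (IllegalArgumentException expected) {
            thrown = true;
        }
        System.out.println("exception thrown: " + thrown);
    }
}
```

In a real JUnit 5 test the try/catch would be replaced by `assertThrows(IllegalArgumentException.class, ...)`.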
##########
File path:
hudi-flink/src/main/java/org/apache/hudi/streamer/FlinkStreamerConfig.java
##########
@@ -69,6 +69,9 @@
   @Parameter(names = {"--table-type"}, description = "Type of table. COPY_ON_WRITE (or) MERGE_ON_READ.", required = true)
   public String tableType;
+  @Parameter(names = {"--append-only"}, description = "Write data to new parquet in every checkpoint. Only support in COPY_ON_WRITE.", required = true)
Review comment:
`Only supported for COPY_ON_WRITE tables.`
##########
File path:
hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnReadWithCompact.java
##########
@@ -45,6 +45,12 @@ public void testIndexStateBootstrap() {
// Ignore the index bootstrap because we only support parquet load now.
}
+ @Disabled
+ @Test
+ public void testAppendOnly() {
+ // Ignore the append only because only support cow.
Review comment:
ditto
##########
File path:
hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java
##########
@@ -207,13 +226,18 @@ private HoodieRecordLocation getNewRecordLocation(String partitionPath) {
final HoodieRecordLocation location;
switch (bucketInfo.getBucketType()) {
case INSERT:
Review comment:
Is there any code path where we might get an `APPEND_ONLY` value from `bucketInfo.getBucketType()`?
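The concern here is whether the `switch` over bucket types needs an explicit case (or a defensive default) for an append-only bucket. A sketch of that defensive pattern; the `APPEND` constant and the descriptions are illustrative assumptions, not Hudi's actual `BucketType` enum:

```java
public class BucketTypeSwitch {

    // Hypothetical enum; the APPEND constant models the reviewer's question
    enum BucketType { INSERT, UPDATE, APPEND }

    static String describe(BucketType type) {
        switch (type) {
            case INSERT:
                return "new file group";
            case UPDATE:
                return "existing file group";
            case APPEND:
                // If this value is reachable, it needs explicit handling here
                return "always a new file slice";
            default:
                // Fail fast on any bucket type the switch does not cover
                throw new AssertionError("Unexpected bucket type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(BucketType.APPEND));
    }
}
```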
##########
File path: hudi-flink/src/test/java/org/apache/hudi/utils/TestData.java
##########
@@ -381,6 +381,35 @@ public static void checkWrittenData(
}
}
+  public static void checkWrittenAllData(
+      File baseFile,
+      Map<String, List<String>> expected,
+      int partitions) throws IOException {
+    assert baseFile.isDirectory();
+    FileFilter filter = file -> !file.getName().startsWith(".");
+    File[] partitionDirs = baseFile.listFiles(filter);
+    assertNotNull(partitionDirs);
Review comment:
add some empty lines around the assert statements?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
> Support Append only in Flink stream
> -----------------------------------
>
> Key: HUDI-2087
> URL: https://issues.apache.org/jira/browse/HUDI-2087
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Flink Integration
> Reporter: yuzhaojing
> Assignee: yuzhaojing
> Priority: Major
> Labels: pull-request-available
>
> It is necessary to support an append-only mode in the Flink stream, as the data
> lake should be able to write log-type data to Parquet with high performance and
> without merging.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)