liaorui opened a new pull request, #7585: URL: https://github.com/apache/inlong/pull/7585
### Prepare a Pull Request

*(Change the title to follow the example below)*

- Title Example: [INLONG-XYZ][Component] Title of the pull request

*(The following *XYZ* should be replaced by the actual [GitHub Issue](https://github.com/apache/inlong/issues) number)*

- Fixes #7584

### Motivation

*Explain here the context, and why you're making that change. What is the problem you're trying to solve?*

1. When migrating all tables in a database, the Doris server prefers CSV format to JSON format because of its lower CPU load.
2. Dirty data can be archived to storage such as S3, which is useful for data accounting and auditing.

### Modifications

*Describe the modifications you've done.*

1. The `serialize` method in the `DorisDynamicSchemaOutputFormat` class can serialize data as CSV or JSON, depending on a `DorisDynamicTableFactory` option.
2. `physicalData`, which is a Map, carries the `__DIRTY_LOG_TAG__`, `__DIRTY_LABEL__`, and `__DIRTY_IDENTIFIER__` flags. The dirty helper uses them to archive dirty data when the Doris server throws an exception. The Doris server ignores these flags because they are not real fields of the Doris table.
3. Dirty data is archived to S3 or a file only when `sink.multiple.schema-update.policy` is set to `LOG_WITH_IGNORE`. Otherwise, only dirty data metrics are reported and no data is archived to files.
4. The cdc-base module provides the dirty helper and the metric tool for the Doris connector.

### Verifying this change

*(Please pick either of the following options)*

- [ ] This change is a trivial rework/code cleanup without any test coverage.
- [ ] This change is already covered by existing tests, such as: *(please describe tests)*
- [ ] This change added tests and can be verified as follows:

  *(example:)*
  - *Added integration tests for end-to-end deployment with large payloads (10MB)*
  - *Extended integration test for recovery after broker failure*

### Documentation

- Does this pull request introduce a new feature? (yes / no)
- If yes, how is the feature documented?
  (not applicable / docs / JavaDocs / not documented)
- If a feature is not applicable for documentation, explain why?
- If a feature is not documented yet in this PR, please create a follow-up issue for adding the documentation

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
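
The behaviour listed under Modifications can be pictured with the minimal Java sketch below. It is an illustration only, not the code in this pull request: the class `DorisSerializeSketch`, its method names, the `format` and `csvDelimiter` parameters, the `\N` null marker, and the choice to leave the flag keys out of CSV rows are all assumptions. Only the three flag names and the `LOG_WITH_IGNORE` policy value come from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; NOT the code in DorisDynamicSchemaOutputFormat. */
final class DorisSerializeSketch {

    // Flag keys named in the PR description.
    static final String DIRTY_LOG_TAG = "__DIRTY_LOG_TAG__";
    static final String DIRTY_LABEL = "__DIRTY_LABEL__";
    static final String DIRTY_IDENTIFIER = "__DIRTY_IDENTIFIER__";

    static boolean isDirtyFlag(String key) {
        return DIRTY_LOG_TAG.equals(key) || DIRTY_LABEL.equals(key) || DIRTY_IDENTIFIER.equals(key);
    }

    /** Serializes one row as "csv" or "json"; any other format is rejected. */
    static String serialize(Map<String, Object> physicalData, String format, String csvDelimiter) {
        if ("csv".equalsIgnoreCase(format)) {
            List<String> cells = new ArrayList<>();
            for (Map.Entry<String, Object> e : physicalData.entrySet()) {
                if (isDirtyFlag(e.getKey())) {
                    continue; // assumption: CSV is positional, so flags are left out
                }
                // assumption: "\N" is used as the null marker
                cells.add(e.getValue() == null ? "\\N" : String.valueOf(e.getValue()));
            }
            return String.join(csvDelimiter, cells);
        }
        if ("json".equalsIgnoreCase(format)) {
            // The whole map is written, flags included; the description says
            // the Doris server ignores keys that are not table fields.
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<String, Object> e : physicalData.entrySet()) {
                if (!first) {
                    sb.append(',');
                }
                first = false;
                sb.append('"').append(escape(e.getKey())).append("\":");
                Object v = e.getValue();
                if (v == null || v instanceof Number || v instanceof Boolean) {
                    sb.append(v); // null, numbers and booleans are written unquoted
                } else {
                    sb.append('"').append(escape(String.valueOf(v))).append('"');
                }
            }
            return sb.append('}').toString();
        }
        throw new IllegalArgumentException("Unsupported format: " + format);
    }

    /** Escapes backslashes and double quotes only; not a full JSON escaper. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Archiving is enabled only for the LOG_WITH_IGNORE policy. */
    static boolean shouldArchive(String schemaUpdatePolicy) {
        return "LOG_WITH_IGNORE".equals(schemaUpdatePolicy);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "alice");
        row.put(DIRTY_LABEL, "example-label"); // hypothetical flag value
        System.out.println(serialize(row, "csv", "|"));
        System.out.println(serialize(row, "json", "|"));
        System.out.println(shouldArchive("LOG_WITH_IGNORE"));
    }
}
```

A production serializer would more likely rely on a JSON library and on the connector's configured column order. The hand-written version here only keeps the sketch free of dependencies.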
