yangzhg opened a new issue #4715: URL: https://github.com/apache/incubator-doris/issues/4715
When multiload is used to import data into multiple tables atomically, its interface is basically the same as stream load, but the functionality is not fully supported. Importing traditional data into Doris requires multi-table transaction support, so the original MultiLoad implementation needs to be reworked, and the actual import plan should be changed to use streaming.

The design keeps the original API. The data is still downloaded and temporarily stored on the BE, and the FE still stores the import meta information. The difference is in the commit phase, which uses a new plan: instead of going through the ETL process, the plan is executed directly with streaming import. The execution plan is generated with reference to broker load and is similar to a broker load plan. Data reading is changed from the HTTP reads of broker load to reading local files; the rest is basically the same as broker load.

The basic process is as follows:

* `_multi_start`: starts the import transaction and creates the txn.
* `_load`: the FE records the import meta information, then generates and saves data similar to `BrokerFileGroup`; the BE downloads and temporarily stores the data.
* `_multi_commit`: generates an execution plan on the FE side. Plan generation follows broker load and uses streaming import. The plan is sent to the BE for execution, and the API returns once execution completes.
* `_multi_abort` and `_multi_desc` remain the same as before.

The parameters used in `_load` are the same as before, and the header parameters are the same as for stream load.

Note that, unlike broker load, the final plan generated by multiload can only be executed sequentially on a single node. This is mainly to preserve the order of the imported files.
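The process above can be pictured as a small state model. The following is only an illustrative sketch, not Doris code: `MultiLoadTxn`, `SubLoad`, and `TxnState` are hypothetical names, and the "plan" is reduced to an ordered list of `(table, local file)` pairs to show that `_load` only records metadata while `_multi_commit` builds a single plan that preserves file order.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class TxnState(Enum):
    STARTED = "started"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class SubLoad:
    """Hypothetical stand-in for the BrokerFileGroup-like metadata kept by the FE."""
    table: str
    local_path: str  # file the BE downloaded and stored temporarily


@dataclass
class MultiLoadTxn:
    """Toy model of one multiload transaction; created by `_multi_start`."""
    label: str
    state: TxnState = TxnState.STARTED
    sub_loads: List[SubLoad] = field(default_factory=list)

    def _require_started(self) -> None:
        if self.state is not TxnState.STARTED:
            raise RuntimeError(f"txn {self.label} is already {self.state.value}")

    def load(self, table: str, local_path: str) -> None:
        # `_load`: record meta information only; nothing is imported yet.
        self._require_started()
        self.sub_loads.append(SubLoad(table, local_path))

    def commit(self) -> List[Tuple[str, str]]:
        # `_multi_commit`: build one plan covering every recorded file.
        # The list keeps arrival order, mirroring sequential execution
        # on a single node.
        self._require_started()
        plan = [(s.table, s.local_path) for s in self.sub_loads]
        self.state = TxnState.COMMITTED
        return plan

    def abort(self) -> None:
        # `_multi_abort`: discard everything recorded so far.
        self._require_started()
        self.sub_loads.clear()
        self.state = TxnState.ABORTED
```

A short usage example under the same assumptions (table names and paths are made up):

```python
txn = MultiLoadTxn(label="demo_label")
txn.load("tbl_a", "/tmp/a_1.csv")
txn.load("tbl_b", "/tmp/b_1.csv")
txn.load("tbl_a", "/tmp/a_2.csv")
plan = txn.commit()  # files appear in the order they were loaded
```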
