trunckman opened a new issue, #27427: URL: https://github.com/apache/doris/issues/27427
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

1.2.2-rc01

### What's Wrong?

When a Spark Load job reaches the LOADING stage, the BE crashes immediately with the following error:

    2023-11-22 19:12:35 I1122 19:12:35.430696 591 daemon.cpp:256] OS physical memory 5.79 GB. Process memory usage 411.33 MB, limit 4.63 GB, soft limit 4.17 GB. Sys available memory 4.64 GB, low water mark 593.28 MB, warning water mark 1.16 GB. Refresh interval memory growth 0 B
    2023-11-22 19:12:37 I1122 19:12:37.808765 1226 tablet_manager.cpp:899] find expired transactions for 0 tablets
    2023-11-22 19:12:37 I1122 19:12:37.809018 1226 tablet_manager.cpp:937] success to build all report tablets info. tablet_count=1
    2023-11-22 19:12:37 I1122 19:12:37.809084 1225 data_dir.cpp:734] path: /opt/apache-doris/be/storage total capacity: 245107195904, available capacity: 20833218560
    2023-11-22 19:12:37 I1122 19:12:37.809141 1225 storage_engine.cpp:367] get root path info cost: 0 ms. tablet counter: 1
    2023-11-22 19:12:37 I1122 19:12:37.814081 1224 task_worker_pool.cpp:1519] successfully report TASK|host=10.193.235.56|port=9020
    2023-11-22 19:12:37 I1122 19:12:37.816437 1226 task_worker_pool.cpp:1519] successfully report TABLET|host=10.193.235.56|port=9020
    2023-11-22 19:12:37 I1122 19:12:37.819788 1225 task_worker_pool.cpp:1519] successfully report DISK|host=10.193.235.56|port=9020
    2023-11-22 19:12:37 I1122 19:12:37.826227 1372 task_worker_pool.cpp:252] successfully submit task|type=REALTIME_PUSH|signature=10012|queue_size=1
    2023-11-22 19:12:37 I1122 19:12:37.826923 1184 task_worker_pool.cpp:619] get push task. signature=10012, priority=NORMAL push_type=LOAD_V2
    2023-11-22 19:12:37 I1122 19:12:37.827843 1184 engine_batch_load_task.cpp:253] begin to process push. transaction_id=10011 tablet_id=14007, version=-1
    2023-11-22 19:12:37 I1122 19:12:37.827883 1184 push_handler.cpp:55] begin to realtime push. tablet=14007.1531981042.5742ffccb4a3bd8a-55e35e96369a1a99, transaction_id=10011
    2023-11-22 19:12:37 I1122 19:12:37.829012 1184 push_handler.cpp:211] tablet=14007.1531981042.5742ffccb4a3bd8a-55e35e96369a1a99, file path=hdfs://10.78.2.133:8020/tmp/doris1/jobs/11001/spark_load_test86/24015/V1.spark_load_test86.14005.14004.14006.0.1531981042.parquet, file size=896
    2023-11-22 19:12:37 *** Query id: 0-0 ***
    2023-11-22 19:12:37 *** Aborted at 1700651557 (unix time) try "date -d @1700651557" if you are using GNU date ***
    2023-11-22 19:12:37 *** Current BE git commitID: a3521b366 ***
    2023-11-22 19:12:37 *** SIGSEGV address not mapped to object (@0x20) received by PID 584 (TID 0xfffccfe16b40) from PID 32; stack trace: ***
    2023-11-22 19:12:38 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
    2023-11-22 19:12:38 1# 0x0000FFFF997367A0 in linux-vdso.so.1
    2023-11-22 19:12:38 2# doris::PushHandler::_convert_v2(std::shared_ptr<doris::Tablet>, std::shared_ptr<doris::Rowset>*, std::shared_ptr<doris::TabletSchema>) at /root/doris/be/src/olap/push_handler.cpp:247
    2023-11-22 19:12:38 3# doris::PushHandler::_do_streaming_ingestion(std::shared_ptr<doris::Tablet>, doris::TPushReq const&, doris::PushType, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo> >*) at /root/doris/be/src/olap/push_handler.cpp:147
    2023-11-22 19:12:38 4# doris::PushHandler::process_streaming_ingestion(std::shared_ptr<doris::Tablet>, doris::TPushReq const&, doris::PushType, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo> >*) at /root/doris/be/src/olap/push_handler.cpp:63
    2023-11-22 19:12:38 5# doris::EngineBatchLoadTask::_push(doris::TPushReq const&, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo> >*) at /root/doris/be/src/olap/task/engine_batch_load_task.cpp:281
    2023-11-22 19:12:38 6# doris::EngineBatchLoadTask::_process() at /root/doris/be/src/olap/task/engine_batch_load_task.cpp:232
    2023-11-22 19:12:38 7# doris::EngineBatchLoadTask::execute() at /root/doris/be/src/olap/task/engine_batch_load_task.cpp:66
    2023-11-22 19:12:38 8# doris::StorageEngine::execute_task(doris::EngineTask*) at /root/doris/be/src/olap/storage_engine.cpp:1022
    2023-11-22 19:12:38 9# doris::TaskWorkerPool::_push_worker_thread_callback() at /root/doris/be/src/agent/task_worker_pool.cpp:627
    2023-11-22 19:12:38 10# doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:542
    2023-11-22 19:12:38 11# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:455
    2023-11-22 19:12:38 12# start_thread in /lib/aarch64-linux-gnu/libpthread.so.0
    2023-11-22 19:12:38 13# 0x0000FFFF9958001C in /lib/aarch64-linux-gnu/libc.so.6
    2023-11-22 19:12:38
    2023-11-22 19:12:38 /opt/apache-doris/be/bin/start_be.sh: line 244: 584 Segmentation fault ${LIMIT:+${LIMIT}} "${DORIS_HOME}/lib/doris_be" "$@" 2>&1 < /dev/null
    2023-11-22 19:12:38 finished

### What You Expected?

I would like to know why this happens and how to fix it.

### How to Reproduce?

1. Table creation statement:

    CREATE TABLE `load_obs_file_test` (
      `id` int(11) NULL,
      `name` varchar(50) NULL,
      `age` tinyint(4) NULL
    ) ENGINE = OLAP
    UNIQUE KEY(`id`)
    COMMENT 'OLAP'
    DISTRIBUTED BY HASH(`id`) BUCKETS 1
    PROPERTIES (
      "replication_allocation" = "tag.location.default: 1",
      "in_memory" = "false",
      "storage_format" = "V2",
      "disable_auto_compaction" = "false"
    );

2. Load statement:

    LOAD LABEL spark_load_test85 (
      DATA INFILE("hdfs://xx.xx.xx.xx:8020/tmp/test/test.csv")
      INTO TABLE `load_obs_file_test`
      COLUMNS TERMINATED BY ","
      FORMAT AS "csv"
      (id,name,age)
      set (
        id=id,
        name=name,
        age=age
      )
    )
    WITH RESOURCE 'spark2' (
      "spark.executor.memory" = "1g",
      "spark.shuffle.compress" = "true"
    )
    PROPERTIES (
      "timeout" = "3600",
      "max_filter_ratio"="1"
    );

### Anything Else?

1. The Spark Load job ran successfully up to the LOADING stage.
2. The parquet file was generated successfully.
3. The contents and metadata of the parquet file are as follows:

    parquet-tools show V1.spark_load_test85.24165.24164.24166.0.544377767.parquet
    +------+----------+-------+
    |   id | name     |   age |
    |------+----------+-------|
    |    1 | zhangsan |    14 |
    |    2 | lisi     |    19 |
    +------+----------+-------+

    parquet-tools inspect V1.spark_load_test85.24165.24164.24166.0.544377767.parquet

    ############ file meta data ############
    created_by: parquet-mr version 1.12.0 (build db75a6815f2ba1d1ee89d1a90aeb296f1f3a8f20)
    num_columns: 3
    num_rows: 2
    num_row_groups: 1
    format_version: 1.0
    serialized_size: 651

    ############ Columns ############
    id
    name
    age

    ############ Column(id) ############
    name: id
    path: id
    max_definition_level: 1
    max_repetition_level: 0
    physical_type: INT32
    logical_type: None
    converted_type (legacy): NONE
    compression: SNAPPY (space_saved: -5%)

    ############ Column(name) ############
    name: name
    path: name
    max_definition_level: 1
    max_repetition_level: 0
    physical_type: BYTE_ARRAY
    logical_type: String
    converted_type (legacy): UTF8
    compression: SNAPPY (space_saved: -4%)

    ############ Column(age) ############
    name: age
    path: age
    max_definition_level: 1
    max_repetition_level: 0
    physical_type: INT32
    logical_type: Int(bitWidth=8, isSigned=true)
    converted_type (legacy): INT_8
    compression: SNAPPY (space_saved: -5%)

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
