trunckman opened a new issue, #27427:
URL: https://github.com/apache/doris/issues/27427

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   1.2.2-rc01
   
   ### What's Wrong?
   
   When a spark load job reaches the loading phase, the BE crashes immediately with the following error:
   
   2023-11-22 19:12:35 I1122 19:12:35.430696   591 daemon.cpp:256] OS physical 
memory 5.79 GB. Process memory usage 411.33 MB, limit 4.63 GB, soft limit 4.17 
GB. Sys available memory 4.64 GB, low water mark 593.28 MB, warning water mark 
1.16 GB. Refresh interval memory growth 0 B
   2023-11-22 19:12:37 I1122 19:12:37.808765  1226 tablet_manager.cpp:899] find 
expired transactions for 0 tablets
   2023-11-22 19:12:37 I1122 19:12:37.809018  1226 tablet_manager.cpp:937] 
success to build all report tablets info. tablet_count=1
   2023-11-22 19:12:37 I1122 19:12:37.809084  1225 data_dir.cpp:734] path: 
/opt/apache-doris/be/storage total capacity: 245107195904, available capacity: 
20833218560
   2023-11-22 19:12:37 I1122 19:12:37.809141  1225 storage_engine.cpp:367] get 
root path info cost: 0 ms. tablet counter: 1
   2023-11-22 19:12:37 I1122 19:12:37.814081  1224 task_worker_pool.cpp:1519] 
successfully report TASK|host=10.193.235.56|port=9020
   2023-11-22 19:12:37 I1122 19:12:37.816437  1226 task_worker_pool.cpp:1519] 
successfully report TABLET|host=10.193.235.56|port=9020
   2023-11-22 19:12:37 I1122 19:12:37.819788  1225 task_worker_pool.cpp:1519] 
successfully report DISK|host=10.193.235.56|port=9020
   2023-11-22 19:12:37 I1122 19:12:37.826227  1372 task_worker_pool.cpp:252] 
successfully submit task|type=REALTIME_PUSH|signature=10012|queue_size=1
   2023-11-22 19:12:37 I1122 19:12:37.826923  1184 task_worker_pool.cpp:619] 
get push task. signature=10012, priority=NORMAL push_type=LOAD_V2
   2023-11-22 19:12:37 I1122 19:12:37.827843  1184 
engine_batch_load_task.cpp:253] begin to process push.  transaction_id=10011 
tablet_id=14007, version=-1
   2023-11-22 19:12:37 I1122 19:12:37.827883  1184 push_handler.cpp:55] begin 
to realtime push. tablet=14007.1531981042.5742ffccb4a3bd8a-55e35e96369a1a99, 
transaction_id=10011
   2023-11-22 19:12:37 I1122 19:12:37.829012  1184 push_handler.cpp:211] 
tablet=14007.1531981042.5742ffccb4a3bd8a-55e35e96369a1a99, file 
path=hdfs://10.78.2.133:8020/tmp/doris1/jobs/11001/spark_load_test86/24015/V1.spark_load_test86.14005.14004.14006.0.1531981042.parquet,
 file size=896
   2023-11-22 19:12:37 *** Query id: 0-0 ***
   2023-11-22 19:12:37 *** Aborted at 1700651557 (unix time) try "date -d 
@1700651557" if you are using GNU date ***
   2023-11-22 19:12:37 *** Current BE git commitID: a3521b366 ***
   2023-11-22 19:12:37 *** SIGSEGV address not mapped to object (@0x20) 
received by PID 584 (TID 0xfffccfe16b40) from PID 32; stack trace: ***
   2023-11-22 19:12:38  0# doris::signal::(anonymous 
namespace)::FailureSignalHandler(int, siginfo_t*, void*) at 
/root/doris/be/src/common/signal_handler.h:420
   2023-11-22 19:12:38  1# 0x0000FFFF997367A0 in linux-vdso.so.1
   2023-11-22 19:12:38  2# 
doris::PushHandler::_convert_v2(std::shared_ptr<doris::Tablet>, 
std::shared_ptr<doris::Rowset>*, std::shared_ptr<doris::TabletSchema>) at 
/root/doris/be/src/olap/push_handler.cpp:247
   2023-11-22 19:12:38  3# 
doris::PushHandler::_do_streaming_ingestion(std::shared_ptr<doris::Tablet>, 
doris::TPushReq const&, doris::PushType, std::vector<doris::TTabletInfo, 
std::allocator<doris::TTabletInfo> >*) at 
/root/doris/be/src/olap/push_handler.cpp:147
   2023-11-22 19:12:38  4# 
doris::PushHandler::process_streaming_ingestion(std::shared_ptr<doris::Tablet>, 
doris::TPushReq const&, doris::PushType, std::vector<doris::TTabletInfo, 
std::allocator<doris::TTabletInfo> >*) at 
/root/doris/be/src/olap/push_handler.cpp:63
   2023-11-22 19:12:38  5# doris::EngineBatchLoadTask::_push(doris::TPushReq 
const&, std::vector<doris::TTabletInfo, std::allocator<doris::TTabletInfo> >*) 
at /root/doris/be/src/olap/task/engine_batch_load_task.cpp:281
   2023-11-22 19:12:38  6# doris::EngineBatchLoadTask::_process() at 
/root/doris/be/src/olap/task/engine_batch_load_task.cpp:232
   2023-11-22 19:12:38  7# doris::EngineBatchLoadTask::execute() at 
/root/doris/be/src/olap/task/engine_batch_load_task.cpp:66
   2023-11-22 19:12:38  8# 
doris::StorageEngine::execute_task(doris::EngineTask*) at 
/root/doris/be/src/olap/storage_engine.cpp:1022
   2023-11-22 19:12:38  9# 
doris::TaskWorkerPool::_push_worker_thread_callback() at 
/root/doris/be/src/agent/task_worker_pool.cpp:627
   2023-11-22 19:12:38 10# doris::ThreadPool::dispatch_thread() at 
/root/doris/be/src/util/threadpool.cpp:542
   2023-11-22 19:12:38 11# doris::Thread::supervise_thread(void*) at 
/root/doris/be/src/util/thread.cpp:455
   2023-11-22 19:12:38 12# start_thread in 
/lib/aarch64-linux-gnu/libpthread.so.0
   2023-11-22 19:12:38 13# 0x0000FFFF9958001C in 
/lib/aarch64-linux-gnu/libc.so.6
   2023-11-22 19:12:38 
   2023-11-22 19:12:38 /opt/apache-doris/be/bin/start_be.sh: line 244:   584 
Segmentation fault      ${LIMIT:+${LIMIT}} "${DORIS_HOME}/lib/doris_be" "$@" 
2>&1 < /dev/null
   2023-11-22 19:12:38 finished
   
   
   ### What You Expected?
   
   I would like to know why this happens and how to resolve it.
   
   ### How to Reproduce?
   
   1. Table creation statement
   CREATE TABLE `load_obs_file_test` (
     `id` int(11) NULL,
     `name` varchar(50) NULL,
     `age` tinyint(4) NULL
   ) ENGINE = OLAP
   UNIQUE KEY(`id`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`id`) BUCKETS 1
   PROPERTIES (
     "replication_allocation" = "tag.location.default: 1",
     "in_memory" = "false",
     "storage_format" = "V2",
     "disable_auto_compaction" = "false"
   );
   
   2. LOAD statement
   LOAD LABEL spark_load_test85
   (
       DATA INFILE("hdfs://xx.xx.xx.xx:8020/tmp/test/test.csv")
       INTO TABLE `load_obs_file_test`
       COLUMNS TERMINATED BY ","
       FORMAT AS "csv"
       (id, name, age)
       SET (
           id = id,
           name = name,
           age = age
       )
   )
   WITH RESOURCE 'spark2'
   (
       "spark.executor.memory" = "1g",
       "spark.shuffle.compress" = "true"
   )
   PROPERTIES
   (
       "timeout" = "3600",
       "max_filter_ratio"="1"
   );
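
   The test.csv referenced in DATA INFILE above is not attached. Based on the two rows visible in the parquet-tools output under "Anything Else?", a minimal sketch to recreate an equivalent file could look like the following (the absence of a header line is an assumption on my part; the comma delimiter matches COLUMNS TERMINATED BY ","):

   # Hypothetical helper to recreate test.csv for the reproduction above.
   # The two rows are copied from the parquet-tools "show" output below.
   rows = [(1, "zhangsan", 14), (2, "lisi", 19)]
   with open("test.csv", "w") as f:
       for id_, name, age in rows:
           f.write(f"{id_},{name},{age}\n")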
   
   ### Anything Else?
   
   1. The spark load job runs successfully up to the loading phase.
   2. The parquet file is generated successfully.
   3. The parquet file contents are as follows:
    parquet-tools show V1.spark_load_test85.24165.24164.24166.0.544377767.parquet
   +------+----------+-------+
   |   id | name     |   age |
   |------+----------+-------|
   |    1 | zhangsan |    14 |
   |    2 | lisi     |    19 |
   +------+----------+-------+
   
   
   
   parquet-tools inspect V1.spark_load_test85.24165.24164.24166.0.544377767.parquet
   
   ############ file meta data ############
   created_by: parquet-mr version 1.12.0 (build 
db75a6815f2ba1d1ee89d1a90aeb296f1f3a8f20)
   num_columns: 3
   num_rows: 2
   num_row_groups: 1
   format_version: 1.0
   serialized_size: 651
   
   
   ############ Columns ############
   id
   name
   age
   
   ############ Column(id) ############
   name: id
   path: id
   max_definition_level: 1
   max_repetition_level: 0
   physical_type: INT32
   logical_type: None
   converted_type (legacy): NONE
   compression: SNAPPY (space_saved: -5%)
   
   ############ Column(name) ############
   name: name
   path: name
   max_definition_level: 1
   max_repetition_level: 0
   physical_type: BYTE_ARRAY
   logical_type: String
   converted_type (legacy): UTF8
   compression: SNAPPY (space_saved: -4%)
   
   ############ Column(age) ############
   name: age
   path: age
   max_definition_level: 1
   max_repetition_level: 0
   physical_type: INT32
   logical_type: Int(bitWidth=8, isSigned=true)
   converted_type (legacy): INT_8
   compression: SNAPPY (space_saved: -5%)
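
   Since the BE crashes inside doris::PushHandler::_convert_v2 while reading this file, one quick check is whether the parquet column types line up with the table schema. Below is a minimal sketch using pyarrow; the expected types are my own reading of the CREATE TABLE statement above (int -> int32, varchar -> string, tinyint -> int8), not something confirmed by Doris:

   # Minimal sketch, assuming pyarrow is installed. Compares the schema of the
   # Spark-generated parquet file against the column types declared for
   # load_obs_file_test (the expected mapping below is an assumption).
   import pyarrow.parquet as pq

   expected = {"id": "int32", "name": "string", "age": "int8"}

   schema = pq.read_schema("V1.spark_load_test85.24165.24164.24166.0.544377767.parquet")
   for field in schema:
       print(f"{field.name}: parquet={field.type}, expected={expected.get(field.name)}")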
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

