xylaaaaa opened a new pull request, #61852:
URL: https://github.com/apache/doris/pull/61852

   ### What problem does this PR solve?
   
   Issue Number: None
   
   Related PR: None
   
   Problem Summary:
   The Hive external regression bootstrap path always prepared and loaded the 
full set of Hive datasets for both Hive2 and Hive3. Even when a suite exercised 
only one Hive version, the startup scripts still decompressed, downloaded, 
copied, and executed the version-specific data for the other version.
   
   This change adds explicit bootstrap groups and uses them to limit host-side 
data preparation and metastore-side initialization:
   - default Hive2 bootstrap groups: `common,hive2_only`
   - default Hive3 bootstrap groups: `common,hive3_only`
   - shared prepare phase merges only the groups needed by the versions started 
in the current job
   - version-specific manifests control which `run.sh`, downloads, and 
preinstalled HQL files are selected
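
   The merge step described above can be sketched roughly as follows. This is a 
minimal illustration, not the code in this PR: the function names 
`default_groups_for` and `merge_bootstrap_groups` are hypothetical, and only the 
default group lists come from the description.

```shell
# Hypothetical helpers: names and signatures are illustrative, not from the PR.

# Default bootstrap groups per Hive version, as listed in the PR description.
default_groups_for() {
    case "$1" in
        hive2) echo "common,hive2_only" ;;
        hive3) echo "common,hive3_only" ;;
        *) echo "unknown Hive version: $1" >&2; return 1 ;;
    esac
}

# Merge the groups of every version started in the current job, dropping
# duplicates while keeping first-seen order.
merge_bootstrap_groups() {
    merged=""
    for version in "$@"; do
        groups="$(default_groups_for "$version")" || return 1
        for group in $(printf '%s' "$groups" | tr ',' ' '); do
            case ",$merged," in
                *",$group,"*) ;;  # already present, skip
                *) merged="${merged:+$merged,}$group" ;;
            esac
        done
    done
    echo "$merged"
}
```

   Under these assumptions, a job that starts only Hive3 would prepare 
`common,hive3_only`, while a job that starts both versions would prepare each 
group once.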
   
   Shared assets remain in `common`. In particular, `test_compress_partitioned` 
stays shared because TVF cases read that HDFS path directly.
   
   Developer Impact:
   - Hive2 bootstrap skips Hive3-only data and HQL work
   - Hive3 bootstrap skips Hive2-only data and HQL work
   - pipeline scripts can also derive these groups through 
`regression-test/pipeline/common/get-hive-bootstrap-groups.sh`
   
   Root Cause:
   The previous implementation discovered bootstrap work through global `find` 
and unconditional download/copy steps, so every Hive job paid the full 
initialization cost.
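
   To illustrate the difference from a global `find`, the sketch below selects 
bootstrap entries from a manifest by group. The manifest format (one 
`<group> <path>` pair per line) and the function name `select_from_manifest` are 
assumptions made for this example; the actual manifests in this PR may be 
structured differently.

```shell
# Hypothetical manifest format: one "<group> <relative path>" pair per line.
# Usage: select_from_manifest MANIFEST GROUPS
# Prints the paths whose group appears in the comma-separated GROUPS list.
select_from_manifest() {
    manifest="$1"
    wanted=",$2,"
    while read -r group path; do
        # Skip blank lines and comment lines.
        case "$group" in ''|'#'*) continue ;; esac
        case "$wanted" in
            *",$group,"*) printf '%s\n' "$path" ;;
        esac
    done < "$manifest"
}
```

   With a selection step like this, entries tagged for the other Hive version 
are never listed, so their download, copy, and `run.sh` steps are skipped.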
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test: Manual test
       - `bash -n 
docker/thirdparties/docker-compose/hive/scripts/bootstrap/bootstrap-groups.sh`
       - `bash -n 
docker/thirdparties/docker-compose/hive/scripts/prepare-hive-data.sh`
       - `bash -n 
docker/thirdparties/docker-compose/hive/scripts/hive-metastore.sh`
       - `bash -n docker/thirdparties/run-thirdparties-docker.sh`
       - `bash -n regression-test/pipeline/common/get-hive-bootstrap-groups.sh`
       - helper smoke test for Hive2/Hive3 bootstrap group selection
   - Behavior changed: Yes (Hive bootstrap now skips version-specific data for 
the other Hive version)
   - Does this need documentation: No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

