GitHub user yjhjstz created a discussion: [Proposal] Enhanced ORCA Parallel Planning to Align with PostgreSQL Planner
### Proposers

@yjhjstz

### Proposal Status

Under Discussion

### Abstract

ORCA (Pivotal Query Optimizer) currently has limited parallel planning capabilities. This creates an inconsistency where:

- The PostgreSQL planner can generate comprehensive parallel plans
- ORCA lacks equivalent parallel planning sophistication
- Users must disable ORCA (`set optimizer=off`) to fully utilize parallel features

### Motivation

Extend ORCA to generate parallel execution plans that align with PostgreSQL's parallel planning approach, while maintaining compatibility with Cloudberry's MPP architecture.

### Implementation

1. Extend ORCA's path generation to create parallel-aware operators:

- CPhysicalParallelSeqScan - Parallel sequential scans
- CPhysicalParallelIndexScan - Parallel index scans
- CPhysicalParallelBitmapHeapScan - Parallel bitmap heap scans
- CPhysicalParallelHash - Parallel hash operations
- CPhysicalParallelHashJoin - Parallel hash joins
- CPhysicalParallelAgg - Parallel aggregation
- CPhysicalParallelSort - Parallel sorting

2. Parallel-Aware Join Planning

Implement parallel join strategies similar to PostgreSQL's:

```cpp
// Parallel hash join with shared hash table
class CPhysicalParallelHashJoin : public CPhysicalHashJoin
{
    // Enable parallel-aware hash table sharing
    // Handle worker coordination for hash table building
    // Manage locus for HashedWorkers distribution
};
```

3. Parallel Cost Model Integration

Enhance ORCA's cost model to account for parallel execution:

- CPU cost reduction based on parallel_workers
- Memory cost adjustments for shared resources
- I/O cost distribution across workers
- Startup cost penalties for worker coordination

4. Parallel Motion Nodes

### Rollout/Adoption Plan

Benefits

1. Performance Consistency - Users get parallel execution regardless of optimizer choice
2. Feature Parity - ORCA matches PostgreSQL planner capabilities
3. Enhanced Scalability - Better utilization of multi-core systems
4. Simplified Configuration - No need to disable ORCA for parallel workloads
5. Benchmark Performance - Faster TPC-DS and TPC-H runs

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

GitHub link: https://github.com/apache/cloudberry/discussions/1316

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org
For additional commands, e-mail: dev-h...@cloudberry.apache.org
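
Regarding the "Parallel Cost Model Integration" item in the proposal, here is a minimal sketch of how CPU run cost might be scaled by the number of workers. It borrows the divisor rule from PostgreSQL's `get_parallel_divisor()` in `costsize.c`. The names `ScanCost`, `ParallelDivisor`, `ParallelizeScanCost`, and the `setup_cost` parameter are hypothetical and chosen only for illustration; they are not ORCA APIs. I/O cost is left undivided here, which is what PostgreSQL does for sequential scans; the proposal may well distribute it differently.

```cpp
#include <cassert>

// Hypothetical cost triple for illustration; ORCA's real cost classes are not used.
struct ScanCost {
    double startup;  // cost paid before the first row is produced
    double cpu_run;  // CPU cost of processing all tuples
    double io_run;   // page-fetch cost of the whole scan
};

// Effective number of processes sharing the CPU work. Mirrors the rule in
// PostgreSQL's get_parallel_divisor(): a participating leader contributes
// 1.0 - 0.3 * workers, but only while that value is positive.
double ParallelDivisor(int workers, bool leader_participates) {
    assert(workers > 0);
    double divisor = static_cast<double>(workers);
    if (leader_participates) {
        double leader = 1.0 - 0.3 * workers;
        if (leader > 0.0) {
            divisor += leader;
        }
    }
    return divisor;
}

// Assumption: CPU cost is divided across processes, I/O cost is not amortized,
// and a fixed worker-coordination penalty is added to the startup cost.
ScanCost ParallelizeScanCost(const ScanCost& serial, int workers,
                             bool leader_participates, double setup_cost) {
    ScanCost out = serial;
    out.cpu_run = serial.cpu_run / ParallelDivisor(workers, leader_participates);
    out.startup = serial.startup + setup_cost;
    return out;
}
```

Under this rule, two workers plus a participating leader give a divisor of about 2.4, while with four workers the leader term turns negative and the divisor is just 4. The startup penalty means small scans stay cheaper in serial form, which is the trade-off the "startup cost penalties" bullet points at.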