narrowizard opened a new pull request, #8677:
URL: https://github.com/apache/incubator-devlake/pull/8677
## Problem
#8676
### Root cause:
- Each column rename operation called HasColumn() twice (checking old and
new column names)
- Each HasColumn() call queries information_schema for all column metadata
- Each RenameColumn() triggered a deferred SELECT to clear PostgreSQL's
query plan cache
- Total: ~270 information_schema queries + 135 cache clearing SELECTs
## Solution
Optimized the migration by:
1. Cache column metadata: Fetch all column names once per table using
GetColumns() instead of calling HasColumn() for each column
2. Batch cache clearing: Clear PostgreSQL cached plan once per table
instead of after each rename
3. Handle dialect differences:
- PostgreSQL: Execute separate ALTER TABLE ... RENAME COLUMN statements
(required by syntax)
- MySQL: Use single ALTER TABLE with multiple CHANGE COLUMN clauses
## Performance Impact
| Metric | Before | After | Improvement |
|----------------------------|--------|-------|---------------|
| Execution time | 22s | 4s | 5.5x faster |
| information_schema queries | ~270 | 3 | 90x reduction |
| Cache clearing SELECTs | 135 | 3 | 45x reduction |
## Testing
### Performance comparison
| Database | Before (s) | After (s) |
|------------|------------|-----------|
| PostgreSQL | <img width="1898" height="162" alt="image"
src="https://github.com/user-attachments/assets/e71df5e0-45b7-466a-b519-bf33457ae2cd"
/> | <img width="1928" height="170" alt="image"
src="https://github.com/user-attachments/assets/959bd1b5-794f-44f1-9229-ee99488dc3d2"
/> |
| MySQL | <img width="1912" height="168" alt="image"
src="https://github.com/user-attachments/assets/9ba03617-5cba-4ce4-8d40-58ac2d262d27"
/> | <img width="1858" height="272" alt="image"
src="https://github.com/user-attachments/assets/9daf4834-75e9-498b-be84-b5c928158273"
/> |
### Schema validation
#### mysql
<img width="1104" height="4228" alt="image"
src="https://github.com/user-attachments/assets/4e01a3f4-fc21-439f-9a5c-e44d4964821e"
/>
<img width="1140" height="3548" alt="image"
src="https://github.com/user-attachments/assets/198ac9dd-4374-460a-9007-8e1aa5277ded"
/>
<img width="1096" height="5232" alt="image"
src="https://github.com/user-attachments/assets/d319e8be-0f09-4ef3-8a4f-f480bb1dd65a"
/>
#### Postgres
<img width="1260" height="4332" alt="image"
src="https://github.com/user-attachments/assets/f63c45e8-cd03-4157-b82c-76501f9d72b1"
/>
<img width="1240" height="5264" alt="image"
src="https://github.com/user-attachments/assets/421aa9f8-25ec-474c-8d6f-02efd41cf065"
/>
<img width="1216" height="3688" alt="image"
src="https://github.com/user-attachments/assets/2617e846-7c44-4712-9cec-1ec45ce92434"
/>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]