vatsrahul1001 commented on issue #63532:
URL: https://github.com/apache/airflow/issues/63532#issuecomment-4065575443

   > [@vatsrahul1001](https://github.com/vatsrahul1001) Thank you for testing! 
I've increased the `BATCH_SIZE` from 1,000 to 10,000 in the PR (both worked 
fine in my local PostgreSQL environment). Could you check if the 10,000 value 
is applied correctly, and also test with 1,000 and 100?
   > 
   > Even with 10,000 rows the string size is around 1MB, so I don't think it 
would be an issue. If errors still occur even with a `BATCH_SIZE` of 100, I'd 
suspect a different cause. (From my experiments, `BATCH_SIZE` of 1,000 or above 
was the optimal range.)
   > 
   > Also, it seems like the error message is not complete — could you share 
the full error message?
   
   The full error with the original batch size (1,000) was a 
`sqlalchemy.exc.StatementError`: PostgreSQL rejected the query because the 
`VALUES` clause with 1,000 literal `(deadline_id::uuid, callback_id::uuid, missed)` 
tuples made the SQL statement too large. Reducing the batch size would work 
around that specific error, but the fundamental approach still has scalability 
concerns.
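
To make the statement-size discussion concrete, here is a rough sketch that estimates how large a literal `VALUES` clause gets at different batch sizes. It is not the PR's code: the helper names `build_values_clause` and `clause_size_bytes` are hypothetical, the UUIDs are random placeholders, and the tuple layout is only assumed from the `(deadline_id::uuid, callback_id::uuid, missed)` shape mentioned above.

```python
import uuid


def build_values_clause(batch_size: int) -> str:
    """Build a literal VALUES clause with `batch_size` (uuid, uuid, bool) tuples.

    Hypothetical helper: the real migration may format its tuples differently.
    """
    rows = (
        f"('{uuid.uuid4()}'::uuid, '{uuid.uuid4()}'::uuid, {'true' if i % 2 else 'false'})"
        for i in range(batch_size)
    )
    return "VALUES " + ", ".join(rows)


def clause_size_bytes(batch_size: int) -> int:
    """Return the UTF-8 size in bytes of the generated VALUES clause."""
    return len(build_values_clause(batch_size).encode("utf-8"))


if __name__ == "__main__":
    # Compare the batch sizes discussed in this thread.
    for n in (100, 1_000, 10_000):
        print(f"BATCH_SIZE={n:>6}: {clause_size_bytes(n):,} bytes")
```

Each tuple comes to roughly 100 bytes, so the statement text grows linearly with the batch size. This only measures the SQL string; it does not reproduce the error itself or account for driver and server overhead.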
   
   What data size are you testing with? I suggest trying a dataset of around 10M rows.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
