Taragolis commented on code in PR #36817:
URL: https://github.com/apache/airflow/pull/36817#discussion_r1456304434
##########
airflow/providers/google/CHANGELOG.rst:
##########
@@ -26,6 +26,14 @@
Changelog
---------
+The default value of ``parquet_row_group_size`` in ``BaseSQLToGCSOperator`` has changed from 1 to
+100000, in order to have a default that provides better compression efficiency and performance of
+reading the data in the output Parquet files. In many cases, the previous value of 1 resulted in
+very large files, long task durations and out of memory issues. A default value of 100000 may require
+more memory to execute the operator, in which case users can override the ``parquet_row_group_size``
+parameter in the operator. All operators that are derived from ``BaseSQLToGCSOperator`` are affected
+when ``export_format`` is ``parquet``: ``MySQLToGCSOperator``, ``PrestoToGCSOperator``,
+``OracleToGCSOperator``, ``TrinoToGCSOperator``, ``MSSQLToGCSOperator`` and ``PostgresToGCSOperator``.
Review Comment:
```suggestion
.. note::
   The default value of ``parquet_row_group_size`` in ``BaseSQLToGCSOperator`` has changed from 1 to
   100000, in order to have a default that provides better compression efficiency and performance of
   reading the data in the output Parquet files. In many cases, the previous value of 1 resulted in
   very large files, long task durations and out of memory issues. A default value of 100000 may require
   more memory to execute the operator, in which case users can override the ``parquet_row_group_size``
   parameter in the operator. All operators that are derived from ``BaseSQLToGCSOperator`` are affected
   when ``export_format`` is ``parquet``: ``MySQLToGCSOperator``, ``PrestoToGCSOperator``,
   ``OracleToGCSOperator``, ``TrinoToGCSOperator``, ``MSSQLToGCSOperator`` and ``PostgresToGCSOperator``.
```
Wrapping the text in a ``.. note::`` directive would make it more noticeable in the changelog.
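
For readers weighing the old and new defaults, here is a rough sketch of why a row group size of 1 is so costly. It assumes each Parquet row group holds at most ``parquet_row_group_size`` rows, so the number of row groups (each carrying its own metadata and compressed separately) grows as the size shrinks. The helper `row_group_count` is hypothetical and is not part of the provider; the exact batching inside the operator may differ.

```python
def row_group_count(total_rows: int, row_group_size: int) -> int:
    """Approximate number of Parquet row groups for an export.

    Assumes every row group holds at most ``row_group_size`` rows.
    Illustrative helper only; it is not part of the Airflow provider.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if row_group_size < 1:
        raise ValueError("row_group_size must be at least 1")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_rows + row_group_size - 1) // row_group_size


if __name__ == "__main__":
    rows = 1_000_000
    for size in (1, 100_000):
        print(f"row_group_size={size}: {row_group_count(rows, size)} row groups")
```

Under that assumption, a one-million-row export goes from one row group per row to ten row groups in total. Users who hit memory limits with the new default can pass a smaller ``parquet_row_group_size`` to the operator.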
