kosiew opened a new pull request, #1119:
URL: https://github.com/apache/datafusion-python/pull/1119

   
   ## Which issue does this PR close?
   
   Partial fix for #1078.
   
   ## Rationale for this change
   
   This change improves the flexibility and performance of DataFrame rendering 
in notebooks and other environments.  
   It introduces fine-grained control over memory usage, row display counts, 
and HTML output optimization, making large data exploration more efficient and 
user-friendly.  
   It also cleans up validation logic for formatter settings and supports 
custom styling providers more robustly.
   
   ## What changes are included in this PR?
   
   - Added `max_memory_bytes`, `min_rows_display`, and `repr_rows` parameters 
to the DataFrame HTML formatter.
   - Updated Python `configure_formatter` API and documentation to expose new 
parameters.
   - Improved internal validation for formatter parameters 
(`_validate_positive_int`, `_validate_bool`).
   - Introduced `FormatterConfig` in Rust to carry display configuration across 
DataFrame rendering.
   - Updated Rust `collect_record_batches_to_display` to respect new memory and 
row limits dynamically.
   - Added tests covering memory limits, row controls, and style provider usage.
   - Documentation updates explaining memory and performance optimizations, 
including `use_shared_styles`.
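
To illustrate how a memory budget and a minimum row count can interact when collecting batches for display, here is a minimal Python sketch. It is not the Rust `collect_record_batches_to_display` from this PR: `FormatterLimits`, `collect_batches_to_display`, and the `(row_count, size_in_bytes)` batch tuples are hypothetical stand-ins, and the stopping rule (stop once the budget is used up, unless fewer than `min_rows_display` rows have been gathered) is an assumption about the intended behavior. `repr_rows` is left out because its exact semantics are not spelled out here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical batch representation: (row_count, size_in_bytes).
Batch = Tuple[int, int]


@dataclass
class FormatterLimits:
    """Hypothetical stand-in for two of the display settings named in this PR."""

    max_memory_bytes: int = 2 * 1024 * 1024
    min_rows_display: int = 20


def collect_batches_to_display(
    batches: Iterable[Batch], limits: FormatterLimits
) -> Tuple[List[Batch], int, bool]:
    """Collect batches until the memory budget is used up.

    Collection continues past the budget while fewer than
    ``min_rows_display`` rows have been gathered (assumed rule).
    Returns (collected_batches, total_rows, has_more).
    """
    collected: List[Batch] = []
    rows = 0
    size = 0
    has_more = False
    for batch in batches:
        row_count, nbytes = batch
        budget_used = size >= limits.max_memory_bytes
        enough_rows = rows >= limits.min_rows_display
        if budget_used and enough_rows:
            # At least one more batch exists that will not be shown.
            has_more = True
            break
        collected.append(batch)
        rows += row_count
        size += nbytes
    return collected, rows, has_more


if __name__ == "__main__":
    stream = [(10, 400)] * 4  # four batches of 10 rows, 400 bytes each
    shown, total_rows, truncated = collect_batches_to_display(
        stream, FormatterLimits(max_memory_bytes=500, min_rows_display=5)
    )
    print(len(shown), total_rows, truncated)
```

In this sketch the minimum row count takes priority over the memory budget, so a very small `max_memory_bytes` still yields a usable preview.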
   
   ## Are these changes tested?
   
   ✅ Yes, additional tests have been added:
   - Validation of new parameters in `test_html_formatter_memory_and_rows`.
   - Verification of custom style provider behavior combined with formatter 
parameters.
   - Edge case testing for extreme values (e.g., very high/low limits).
   
   ## Are there any user-facing changes?
   
   ✅ Yes:
   - Users can now configure how much memory and how many rows are used when 
displaying DataFrames.
   - Improved error messages for invalid formatter configurations.
   - Better performance when rendering large numbers of DataFrames in Jupyter 
notebooks or other rich environments.
   - Documentation updated to reflect the new options available.
   
-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

