timsaucer opened a new pull request, #1036:
URL: https://github.com/apache/datafusion-python/pull/1036

   # Which issue does this PR close?
   
   None.
   
   # Rationale for this change
   
   The notebook rendering of DataFrames is very useful, but it can be enhanced. 
This PR adds the following quality-of-life improvements:
   
   - The table is now scrollable both vertically and horizontally.
   - Instead of collecting an arbitrary 10 rows, we collect up to 2 MB worth of 
data.
   - For scalars that render to long strings (more than 25 characters), we 
truncate the display and add a `...` button that expands the cell so you can 
view it in its entirety.
   - When more data are available than are displayed, we indicate to the user 
that the data are truncated.
   - When no data are returned, we tell the user so.
   
   # What changes are included in this PR?
   
   This PR adds a feature to collect record batches and uses their size 
estimate to collect up to 2 MB worth of data. This is typically enough for most 
use cases to review the data, but it is a constant we can update. We then 
determine how many rows to show to the user: either 2 MB worth (a record batch 
can easily contain more than this) or at least 20 rows (also up for changing). 
Finally, we render these rows as an HTML table.
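
   The collection logic described above can be sketched roughly as follows. 
This is an illustrative sketch in Python, not the code in this PR: the names 
`collect_for_display`, `FakeBatch`, `MAX_BYTES`, and `MIN_ROWS` are 
hypothetical, and the sketch assumes that each batch exposes `num_rows` and 
`nbytes` attributes and that 2 MB means 2 * 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

MAX_BYTES = 2 * 1024 * 1024  # assumed interpretation of the 2 MB budget
MIN_ROWS = 20                # minimum number of rows to show when available


@dataclass
class FakeBatch:
    """Hypothetical stand-in for a record batch (row count and size estimate)."""
    num_rows: int
    nbytes: int


def collect_for_display(
    batches: Iterable[FakeBatch],
    max_bytes: int = MAX_BYTES,
    min_rows: int = MIN_ROWS,
) -> Tuple[List[FakeBatch], int, bool]:
    """Return (collected batches, rows to show, whether output is truncated)."""
    collected: List[FakeBatch] = []
    total_rows = 0
    total_bytes = 0
    has_more = False

    stream = iter(batches)
    for batch in stream:
        collected.append(batch)
        total_rows += batch.num_rows
        total_bytes += batch.nbytes
        # Stop once the size budget is met and we have enough rows.
        if total_bytes >= max_bytes and total_rows >= min_rows:
            # Peek at one more batch only to learn whether data remains.
            has_more = next(stream, None) is not None
            break

    if total_bytes > max_bytes:
        # Scale the row count down to roughly max_bytes worth of data,
        # but never below min_rows (or the rows actually available).
        rows_to_show = max(min_rows, total_rows * max_bytes // total_bytes)
        rows_to_show = min(rows_to_show, total_rows)
    else:
        rows_to_show = total_rows

    truncated = has_more or rows_to_show < total_rows
    return collected, rows_to_show, truncated
```

   In this sketch, the `truncated` flag is what would drive the "data are 
truncated" notice, and a `rows_to_show` of zero corresponds to the "no data" 
message.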
   
   In the rendering we check whether an individual cell contains more than 25 
characters. If so, we show a 25-character snippet of the string representation 
of the data and a `...` button with a JavaScript call that updates which data 
are displayed in the cell.
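
   As a rough illustration of this cell rendering, the Python sketch below 
emits either a plain cell or a truncated cell with an expand button. It is not 
the markup generated by this PR: `render_cell`, `toggleCell`, the element ids, 
and `MAX_CELL_CHARS` are hypothetical names chosen for the example.

```python
import html

MAX_CELL_CHARS = 25  # cells longer than this are shown as a snippet

# Hypothetical JavaScript helper, emitted once per table, that swaps the
# snippet and the full value when the "..." button is clicked.
TOGGLE_SCRIPT = """<script>
function toggleCell(id) {
  const shortEl = document.getElementById(id + "-short");
  const fullEl = document.getElementById(id + "-full");
  const showFull = fullEl.style.display === "none";
  fullEl.style.display = showFull ? "inline" : "none";
  shortEl.style.display = showFull ? "none" : "inline";
}
</script>"""


def render_cell(value: object, cell_id: str, max_chars: int = MAX_CELL_CHARS) -> str:
    """Render one table cell, truncating long values behind a '...' button."""
    text = str(value)
    if len(text) <= max_chars:
        return f"<td>{html.escape(text)}</td>"

    # Escape after slicing so the snippet is never cut inside an HTML entity.
    short = html.escape(text[:max_chars])
    full = html.escape(text)
    return (
        "<td>"
        f'<span id="{cell_id}-short">{short}</span>'
        f'<span id="{cell_id}-full" style="display: none">{full}</span>'
        f'<button data-cell="{cell_id}" '
        'onclick="toggleCell(this.dataset.cell)">...</button>'
        "</td>"
    )
```

   Keeping both the snippet and the full value in the markup means the toggle 
needs no round trip to the kernel, at the cost of a larger HTML payload.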
   
   # Are there any user-facing changes?
   
   Yes, but not to the API. Any user who uses Jupyter notebooks will see 
these enhanced tables.
   
   See the below screenshots for examples:
   
![table-views-1](https://github.com/user-attachments/assets/e7d4954f-33b4-43c5-82d9-832e31e2222c)
   <img width="1022" alt="table-views-2" src="https://github.com/user-attachments/assets/3098f9a4-f5a5-4658-a3f5-dd6ba7706e4b" />
   <img width="1127" alt="table-views-3" src="https://github.com/user-attachments/assets/c73a6118-75ea-4a40-9e50-2aa5718be03c" />
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

