Re: [I] Enhance `repr` and `_repr_html_` with a note for additional rows [datafusion-python]

via GitHub Thu, 27 Feb 2025 21:05:58 -0800


Spaarsh commented on issue #1026:
URL: https://github.com/apache/datafusion-python/issues/1026#issuecomment-2689730430


   I'd like to work on this issue. Adding a few lines of code along the lines of:
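   ```
   fn __repr__(&self, py: Python) -> PyDataFusionResult<String> {
       // Fetch one row more than we display, to detect whether more rows exist.
       let df = self.df.as_ref().clone().limit(0, Some(11))?;
       let batches = wait_for_future(py, df.collect())?;
       let num_rows = batches.iter().map(|batch| batch.num_rows()).sum::<usize>();

       // Keep only the first 10 rows. Note that `take(10)` on the batch
       // iterator would keep 10 batches, not 10 rows.
       let mut remaining = 10;
       let mut limited_batches = Vec::new();
       for batch in &batches {
           if remaining == 0 {
               break;
           }
           let n = batch.num_rows().min(remaining);
           limited_batches.push(batch.slice(0, n));
           remaining -= n;
       }
       let batches_as_string = pretty::pretty_format_batches(&limited_batches);

       match batches_as_string {
           Ok(batch) => {
               if num_rows > 10 {
                   Ok(format!("DataFrame()\n{batch}\nand more..."))
               } else {
                   Ok(format!("DataFrame()\n{batch}"))
               }
           }
           Err(err) => Ok(format!("Error: {:?}", err.to_string())),
       }
   }
   ```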
   
   Should suffice, I suppose?
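   
   To make the "fetch one extra row, display at most ten" idea easy to check in isolation, here is a minimal standard-library sketch. It is only an illustration, not datafusion-python code: `truncate_rows` is a hypothetical helper, and each inner `Vec` stands in for one record batch.
   
   ```rust
   /// Hypothetical helper (not part of datafusion-python): keep at most
   /// `max_rows` rows across `batches` and report whether rows were cut off.
   /// Each inner `Vec` stands in for one record batch.
   fn truncate_rows<T: Clone>(batches: &[Vec<T>], max_rows: usize) -> (Vec<Vec<T>>, bool) {
       // Total number of rows that were fetched.
       let total: usize = batches.iter().map(|b| b.len()).sum();
   
       let mut remaining = max_rows;
       let mut kept = Vec::new();
       for batch in batches {
           if remaining == 0 {
               break;
           }
           // Take the whole batch, or only as many rows as still fit.
           let n = batch.len().min(remaining);
           kept.push(batch[..n].to_vec());
           remaining -= n;
       }
   
       // More rows were fetched than are displayed, so a note is warranted.
       (kept, total > max_rows)
   }
   ```
   
   With 11 rows fetched and `max_rows` set to 10, the second return value is `true`, which is the condition for appending the "and more..." note.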
   
   > You could also implement a "config" system like pandas uses, so the user can opt-in to displaying more columns or rows https://pandas.pydata.org/docs/user_guide/options.html#overview
   
   As for the config, we'd need to decide on a format. I would suggest `toml`, since `Cargo` already uses it. That probably deserves its own issue, though, since I am sure a host of other things could benefit from such a system.
   
   We could also start on it as part of this issue, if that is alright.
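   
   As a rough idea of what such a config could look like on the Rust side, here is a small sketch. Everything in it is an assumption for discussion: the `DisplayConfig` name, the `max_rows` key, and the default of 10 are hypothetical, and the parser only understands `max_rows = N` lines (which happen to be valid TOML). A real implementation would presumably use a proper TOML parser.
   
   ```rust
   /// Hypothetical display options; none of these names exist in datafusion-python.
   #[derive(Debug, Clone, PartialEq)]
   struct DisplayConfig {
       /// Maximum number of rows shown by `__repr__`.
       max_rows: usize,
   }
   
   impl Default for DisplayConfig {
       fn default() -> Self {
           // Assumed default, matching the 10 rows displayed above.
           Self { max_rows: 10 }
       }
   }
   
   impl DisplayConfig {
       /// Rows to fetch: one more than displayed, to detect additional rows.
       fn fetch_limit(&self) -> usize {
           self.max_rows + 1
       }
   
       /// Reads `max_rows = N` lines; comments, unknown keys, and values that
       /// do not parse as an unsigned integer are ignored.
       fn parse(text: &str) -> Self {
           let mut config = Self::default();
           for line in text.lines() {
               let line = line.trim();
               if line.is_empty() || line.starts_with('#') {
                   continue;
               }
               if let Some((key, value)) = line.split_once('=') {
                   if key.trim() == "max_rows" {
                       if let Ok(n) = value.trim().parse::<usize>() {
                           config.max_rows = n;
                       }
                   }
               }
           }
           config
       }
   }
   ```
   
   The `limit(0, Some(11))` call would then become `limit(0, Some(config.fetch_limit()))`, with the hard-coded 10 replaced by `config.max_rows`.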
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

