[
https://issues.apache.org/jira/browse/ARROW-14798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451449#comment-17451449
]
Will Jones commented on ARROW-14798:
------------------------------------
What do you think of adding an {{inner_window}} field to
{{PrettyPrintOptions}}? Then, in the case of a chunked array, {{window}} would
refer to the number of chunks to show and {{inner_window}} would refer to the
number of elements to show within each of those chunks.
cc [~uwe]
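To make the proposed split concrete, here is a rough pure-Python sketch of the intended semantics. It does not use pyarrow: {{preview_chunks}} is a hypothetical helper, and it assumes that a window of n means "show the first n and the last n items, eliding the middle", applied once to the list of chunks ({{window}}) and once inside each chunk ({{inner_window}}). The default values are only placeholders.

```python
from typing import Callable, List, Sequence


def _elide(items: Sequence, n: int, render: Callable[[object], str]) -> List[str]:
    """Render all items, or only the first/last n with "..." in between."""
    if len(items) <= 2 * n:
        return [render(x) for x in items]
    head = [render(x) for x in items[:n]]
    # Slice from an explicit index so that n == 0 yields an empty tail.
    tail = [render(x) for x in items[len(items) - n:]]
    return head + ["..."] + tail


def preview_chunks(chunks: Sequence[Sequence], window: int = 2,
                   inner_window: int = 10) -> str:
    """Hypothetical helper (not a pyarrow API).

    window       limits how many chunks are shown at each end.
    inner_window limits how many elements are shown at each end of a chunk.
    """
    def render_chunk(chunk) -> str:
        return "[" + ", ".join(_elide(list(chunk), inner_window, str)) + "]"

    return "[" + ", ".join(_elide(list(chunks), window, render_chunk)) + "]"


if __name__ == "__main__":
    # Five chunks of six integers each; keep one chunk and two elements per end.
    print(preview_chunks([list(range(6)) for _ in range(5)],
                         window=1, inner_window=2))
```

With a single {{window}} setting, both limits would have to be the same number; keeping them separate lets the chunk count stay small while each chunk still shows a useful number of elements.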
Example of a table whose repr is a little too long as-is:
{code:python}
import pyarrow
import string
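def print_table(size: int, chunks: int):
    # Build one column per lowercase letter (26 columns), each split into
    # `chunks` chunks of `size` integers, then print the table's repr.
    t = pyarrow.table({
        name: pyarrow.chunked_array([range(size) for _ in range(chunks)])
        for name in string.ascii_lowercase
    })
    print(t)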
{code}
> [Python] Limit the size of the repr for large Tables
> ----------------------------------------------------
>
> Key: ARROW-14798
> URL: https://issues.apache.org/jira/browse/ARROW-14798
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
> Fix For: 7.0.0
>
>
> The new repr is nice in that it shows a preview of the data, but it can also
> become very long, flooding your console output for larger tables.
> We already default to 10 preview cols, but each column can still consist of
> many chunks. So it might be good to also limit it to 2 chunks?
> The ChunkedArray.to_string method already has a {{window}} keyword, but that
> seems to control both the number of elements to show per chunk and the number
> of chunks (while it would be nice to limit e.g. to 2 chunks but show up to 10
> elements for each chunk).
> cc [~amol-]