wjones127 commented on a change in pull request #12148:
URL: https://github.com/apache/arrow/pull/12148#discussion_r786980475
##########
File path: python/pyarrow/table.pxi
##########
@@ -1342,10 +1344,11 @@ cdef class Table(_PandasConvertible):
if preview_cols:
pieces.append('----')
for i in range(min(self.num_columns, preview_cols)):
- pieces.append('{}: {}'.format(
- self.field(i).name,
- self.column(i).to_string(indent=0, skip_new_lines=True)
- ))
+ col_string = self.column(i).to_string(
+ indent=0, skip_new_lines=True)
+ if len(col_string) > cols_char_limit:
+ col_string = col_string[:(cols_char_limit - 3)] + '...'
+ pieces.append('{}: {}'.format(self.field(i).name, col_string))
Review comment:
Thanks for the feedback. I've implemented a very basic version of this for
now. The output looks pretty good for this example:
```python
>>> from random import sample, choice
>>> import pyarrow as pa
>>> arr_int = pa.array(range(50))
>>> tree_parts = ["roots", "trunk", "crown", "seeds"]
>>> arr_list = pa.array([sample(tree_parts,
...     k=choice(range(len(tree_parts)))) for _ in range(50)])
>>> arr_struct = pa.StructArray.from_arrays([arr_int, arr_list],
...     names=['int_nested', 'list_nested'])
>>> arr_map = pa.array(
...     [
...         [(part, choice(range(10))) for part in sample(tree_parts,
...             k=choice(range(len(tree_parts))))]
...         for _ in range(50)
...     ],
...     type=pa.map_(pa.utf8(), pa.int64())
... )
>>> table = pa.table({
...     'int': pa.chunked_array([arr_int] * 10),
...     'list': pa.chunked_array([arr_list] * 10),
...     'struct': pa.chunked_array([arr_struct] * 10),
...     'map': pa.chunked_array([arr_map] * 10),
... })
>>> print(table)
pyarrow.Table
int: int64
list: list<item: string>
child 0, item: string
struct: struct<int_nested: int64, list_nested: list<item: string>>
child 0, int_nested: int64
child 1, list_nested: list<item: string>
child 0, item: string
map: map<string, int64>
child 0, entries: struct<key: string not null, value: int64> not null
child 0, key: string not null
child 1, value: int64
----
int:
[[0,1,2,3,4,5,6,7,8,9,...,40,41,42,43,44,45,46,47,48,49],[0,1,2,3,4,5,6,7,8,9,...,40,41,42,43,44,45,46,47,48,49],[0,1,2,3,...]...]
list:
[[["seeds","trunk","roots"],["trunk","crown"],["crown"],["trunk"],["crown"],[],["roots","seeds"],["roots"],["trunk","roots"]...]...]
struct: [ -- is_valid: all not null -- child 0 type: int64
[
0,
1,
2,
3,
4,
5,
6,...]...]
map: [[ keys:["seeds","crown","trunk"]values:[7,8,7],
keys:["roots","crown"]values:[8,4], keys:["crown","roots","trunk"]...]...]
```
The unfortunate thing is that it behaves badly for string columns whose
values contain `[`. For example:
```python
>>> pa.table({'x': pa.array(["[" * 100] * 500)})
x:
[["[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[","[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[",...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]...]
...]...]...]...]...]...]
```
I think that kind of behavior is pretty much unavoidable until we push this
character limit into the PrettyPrinter implementation itself.
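
To illustrate the failure mode, here is a minimal standalone sketch. It assumes the truncation closes the output by appending one `...]` per unmatched `[` in the kept prefix, which is what the output above suggests. The helper name `truncate_and_close` is hypothetical and this is not the code in the PR:

```python
def truncate_and_close(col_string, char_limit):
    """Hypothetical helper: truncate, then append '...]' per unmatched '['."""
    if len(col_string) <= char_limit:
        return col_string
    head = col_string[:char_limit]
    # Naive depth count: every bracket is counted, including brackets that
    # are part of a quoted string value rather than the list structure.
    depth = head.count('[') - head.count(']')
    return head + '...]' * max(depth, 0)


# Nested integer lists: only the two structural brackets are left open.
nested = str([[i for i in range(50)] for _ in range(10)])
print(truncate_and_close(nested, 40))

# A column of strings made of '[': every character inflates the depth.
brackets = '[["' + '[' * 100 + '","' + '[' * 100 + '"]]'
print(truncate_and_close(brackets, 60))
```

A character-level pass like this cannot tell a structural `[` from one inside a string value, so the second call appends one `...]` for nearly every character it kept.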