danepitkin commented on code in PR #34980:
URL: https://github.com/apache/arrow/pull/34980#discussion_r1166117745
##########
python/pyarrow/table.pxi:
##########
@@ -1450,8 +1450,129 @@ cdef _sanitize_arrays(arrays, names, schema, metadata,
converted_arrays.append(item)
return converted_arrays
+cdef class _Tabular(_PandasConvertible):
+    """Internal: An interface for common operations on tabular objects."""
-cdef class RecordBatch(_PandasConvertible):
+    def __init__(self):
+        raise TypeError("This object is not instantiable, "
+                        "use a subclass instead.")
+
+    def __repr__(self):
+        if not self._is_initialized():
+            raise ValueError("This object's internal pointer is NULL, do not "
Review Comment:
Do we want to keep this `ValueError`? `Table` had it, but `RecordBatch`
didn't, and it seems superfluous IMO. To support it I had to add
`_is_initialized()` so that each subclass can check its own C++ object for
validity.
##########
python/pyarrow/table.pxi:
##########
@@ -1696,15 +1820,10 @@ cdef class RecordBatch(_PandasConvertible):
         except TypeError:
             return NotImplemented
-    def to_string(self, show_metadata=False):
-        # Use less verbose schema output.
-        schema_as_string = self.schema.to_string(
-            show_field_metadata=show_metadata,
-            show_schema_metadata=show_metadata
-        )
-        return 'pyarrow.{}\n{}'.format(type(self).__name__, schema_as_string)
-
     def __repr__(self):
+        # TODO remove this and update pytests/doctests for
Review Comment:
I will follow up with a subsequent diff. Once this override is removed,
`RecordBatch` prints partial tabular data like `Table` does, but a bunch of
doctests need to be updated, so I felt it's better done in a separate diff.
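
To illustrate the interim state, here is a rough pure-Python sketch under
stated assumptions. `_ToyTabular`, `ToyTable`, `ToyRecordBatch`, and the
`_schema`/`_preview` attributes are hypothetical; the `preview_cols` parameter
mirrors the one on `Table.to_string`, but the bodies below are toys, not the
pyarrow implementation.

```python
class _ToyTabular:
    """Toy base class: shared to_string/__repr__ for tabular objects."""

    def __init__(self, schema, preview):
        self._schema = schema    # schema text, e.g. "a: int64"
        self._preview = preview  # list of preview lines, one per column

    def to_string(self, preview_cols=0):
        # Header and schema first, then up to `preview_cols` preview lines.
        lines = [type(self).__name__, self._schema]
        if preview_cols:
            lines.extend(self._preview[:preview_cols])
        return "\n".join(lines)

    def __repr__(self):
        # Shared behavior: include a partial data preview.
        return self.to_string(preview_cols=10)


class ToyTable(_ToyTabular):
    pass


class ToyRecordBatch(_ToyTabular):
    def __repr__(self):
        # Temporary override: keep the schema-only output so existing
        # doctests keep passing until they are updated in a follow-up.
        return self.to_string(preview_cols=0)
```

Deleting the `ToyRecordBatch.__repr__` override makes both classes print the
preview, which is the change deferred to the follow-up.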
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.