jorisvandenbossche commented on a change in pull request #11447:
URL: https://github.com/apache/arrow/pull/11447#discussion_r731565164
##########
File path: python/pyarrow/_fs.pyx
##########
@@ -833,6 +833,12 @@ cdef class SubTreeFileSystem(FileSystem):
         FileSystem.init(self, wrapped)
         self.subtreefs = <CSubTreeFileSystem*> wrapped.get()
+    def __str__(self):
+        return f'SubTreeFileSystem: file:{self.base_path}'
Review comment:
> Does it make sense to omit `subfs.base_fs` from `__repr__` in this case?
I would certainly keep it, as it seems essential for understanding a
SubTreeFileSystem object (i.e. what kind of filesystem it is wrapping).
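To illustrate what keeping the wrapped filesystem in the output could look like, here is a rough pure-Python sketch. `FakeLocalFS` and `FakeSubTreeFS` are hypothetical stand-ins, not pyarrow classes, and the exact output format is only one possibility, not a proposal for this PR:

```python
class FakeLocalFS:
    """Hypothetical stand-in for a wrapped filesystem (not a pyarrow class)."""

    def __repr__(self):
        return "FakeLocalFS()"


class FakeSubTreeFS:
    """Hypothetical stand-in for SubTreeFileSystem: a base path plus a base fs."""

    def __init__(self, base_path, base_fs):
        self.base_path = base_path
        self.base_fs = base_fs

    def __repr__(self):
        # Show both the subtree path and the filesystem being wrapped,
        # so the reader can tell what kind of filesystem sits underneath.
        return (f"FakeSubTreeFS(base_path={self.base_path!r}, "
                f"base_fs={self.base_fs!r})")


subfs = FakeSubTreeFS("/tmp/data", FakeLocalFS())
print(repr(subfs))
```

With only `base_path` shown, two subtree filesystems over different backends would look identical in the output.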
> Ideally,
> [`__repr__`](https://docs.python.org/3/library/functions.html#repr) differs
> from `__str__` in that it allows "reconstruction" (eval) of an equivalent
> object. But this would be difficult for `SubTreeFileSystem` and I do not think
> Arrow has an established convention for representing Python objects.
Indeed, when a distinction is made, that is the usual rule for differentiating
repr from str. But an "eval"-able repr is not always easy or even possible, and
we generally don't provide one in pyarrow (e.g. Array, Table and RecordBatch
don't have a separate repr). For FileSystems it might be possible, though. But I
would defer that to later, because it requires updating the repr of the other
filesystems as well (see note below).
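For reference, the repr/str rule mentioned above can be sketched with a small standalone example. `DemoFS` is a hypothetical value object, not a pyarrow class; it relies on the repr that `dataclasses` generates, which happens to be eval-able for simple field types:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoFS:
    """Hypothetical filesystem-like value object (not a pyarrow class)."""
    root: str
    use_mmap: bool = False

    def __str__(self):
        # Short, human-oriented form; not meant to be eval-able.
        return f"DemoFS at {self.root}"


fs = DemoFS("/tmp/data")

# The dataclass-generated repr spells out the constructor call, so
# evaluating it (with DemoFS in scope) rebuilds an equal object.
rebuilt = eval(repr(fs), {"DemoFS": DemoFS})

print(str(fs))
print(repr(fs))
print(rebuilt == fs)
```

This round trip only works because every field has an eval-able repr of its own, which is why nesting filesystems would require the other filesystems to follow the same rule.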
> Would it make sense to put this PR on hold, and first add
> `__str__/__repr__` to `FileSystem`. Then circle back to this PR and build on
> top of that information? cc @jorisvandenbossche @ianmcook
We could certainly improve the str/reprs of the other FileSystems as well
(and we should open a JIRA for it). But I don't think that needs to hold up this
PR. For example, the repr of LocalFileSystem is already informative and merely
contains some noise, whereas the repr of SubTreeFileSystem is really lacking
information.