jorisvandenbossche commented on a change in pull request #11455:
URL: https://github.com/apache/arrow/pull/11455#discussion_r732820284
##########
File path: python/pyarrow/parquet.py
##########
@@ -687,7 +687,43 @@ def __exit__(self, *args, **kwargs):
         # return false since we want to propagate exceptions
         return False
+    def write(self, table_or_batch):
+        """
+        Write RecordBatch or Table to stream.
Review comment:
Is the "to stream" wording copy-pasted from the RecordBatchWriter
docs? For writing a Parquet file, it sounds a bit strange.
##########
File path: python/pyarrow/parquet.py
##########
@@ -687,7 +687,43 @@ def __exit__(self, *args, **kwargs):
         # return false since we want to propagate exceptions
         return False
+    def write(self, table_or_batch):
+        """
+        Write RecordBatch or Table to stream.
Review comment:
(and the same for `write_batch` / `write_table` below)
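One possible rewording, as a sketch only: the class name `WriterDocSketch` and the exact phrasing are hypothetical and not taken from the PR. Only the method names and parameters come from the diff. The bodies are left out so the block does not depend on pyarrow.

```python
class WriterDocSketch:
    """Hypothetical stand-in, used only to show docstring wording."""

    def write(self, table_or_batch):
        """
        Write RecordBatch or Table to the Parquet file.

        Parameters
        ----------
        table_or_batch : {RecordBatch, Table}
        """
        # Body intentionally omitted: this sketch is about wording only.
        raise NotImplementedError("docstring sketch only")

    def write_batch(self, batch):
        """
        Write RecordBatch to the Parquet file.

        Parameters
        ----------
        batch : RecordBatch
        """
        # Body intentionally omitted: this sketch is about wording only.
        raise NotImplementedError("docstring sketch only")
```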
##########
File path: python/pyarrow/parquet.py
##########
@@ -687,7 +687,43 @@ def __exit__(self, *args, **kwargs):
         # return false since we want to propagate exceptions
         return False
+    def write(self, table_or_batch):
+        """
+        Write RecordBatch or Table to stream.
+
+        Parameters
+        ----------
+        table_or_batch : {RecordBatch, Table}
+        """
+        if isinstance(table_or_batch, pa.RecordBatch):
+            self.write_batch(table_or_batch)
+        elif isinstance(table_or_batch, pa.Table):
+            self.write_table(table_or_batch)
+        else:
+            raise ValueError(type(table_or_batch))
+
+    def write_batch(self, batch):
+        """
+        Write RecordBatch to stream.
+
+        Parameters
+        ----------
+        batch : RecordBatch
+        """
+        table = pa.Table.from_batches([batch], batch.schema)
+        self.write_table(table)
+
     def write_table(self, table, row_group_size=None):
+        """
+        Write Table to stream in (contiguous) RecordBatch objects.
+
+        Parameters
+        ----------
+        table : Table
+        max_chunksize : int, default None
Review comment:
This is probably also copied from the RecordBatchWriter docs. For Parquet,
the keyword is `row_group_size` (which was not documented until now), so
that is the parameter that should be described here.
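Something along these lines could work, as a sketch: the function below is a hypothetical stand-in with the same signature as in the diff, and the parameter description is suggested wording under the assumption that `row_group_size` caps the number of rows per written row group. The small check at the end only illustrates how a docstring/signature mismatch like `max_chunksize` could be spotted.

```python
import inspect


def write_table(self, table, row_group_size=None):
    """
    Write Table to the Parquet file.

    Parameters
    ----------
    table : Table
    row_group_size : int, default None
        Maximum number of rows in each written row group
        (assumed meaning). If None, the writer's default applies.
    """
    # Body intentionally omitted: this sketch is about the docstring only.
    raise NotImplementedError("docstring sketch only")


# Every parameter in the signature (except self) should have a
# "name : type" entry in the numpydoc-style Parameters section.
params = [p for p in inspect.signature(write_table).parameters if p != "self"]
undocumented = [p for p in params if f"{p} :" not in write_table.__doc__]
```

With the docstring from the diff, the same check would report `row_group_size` as undocumented.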