N Gautam Animesh created ARROW-17796:
----------------------------------------
Summary: Using cbind when merging multi datasets using
open_dataset on a directory.
Key: ARROW-17796
URL: https://issues.apache.org/jira/browse/ARROW-17796
Project: Apache Arrow
Issue Type: Task
Reporter: N Gautam Animesh
I was wondering whether cbind can be used with specific column names when
merging multiple datasets with open_dataset(), so that only those columns are
bound.
I was using open_dataset() to read multiple datasets from one directory and
wanted to merge them on particular columns that are common to all of the
datasets.
Is it possible to merge these datasets column-wise? By default, open_dataset()
combines all the datasets row-wise, one after the other.
Do let me know if there is anything like this, or any other workaround.
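To make the request concrete, the following minimal sketch shows the kind of
column-wise result being asked about. It is only one possible approach and is
not confirmed anywhere in this issue: instead of cbind, it opens each file as
its own dataset and combines them with a dplyr join on a shared key. The
temporary directory, the file names a.parquet and b.parquet, and the columns
id, x, and y are all hypothetical.

```r
library(arrow)
library(dplyr)

# Hypothetical example data: two Parquet files that share an "id" column.
dir <- tempfile()
dir.create(dir)
write_parquet(data.frame(id = 1:3, x = c(10, 20, 30)),
              file.path(dir, "a.parquet"))
write_parquet(data.frame(id = 1:3, y = c("a", "b", "c")),
              file.path(dir, "b.parquet"))

# Open each file as a separate dataset rather than opening the directory,
# so the files are not stacked row-wise.
ds_a <- open_dataset(file.path(dir, "a.parquet"))
ds_b <- open_dataset(file.path(dir, "b.parquet"))

# Combine column-wise by joining on the common column, keep only the
# columns of interest, then pull the result into R.
merged <- ds_a %>%
  inner_join(ds_b, by = "id") %>%
  select(id, x, y) %>%
  collect() %>%
  arrange(id)
```

A join matches rows by key, whereas cbind pairs rows by position, so this
sketch assumes the datasets share a column that identifies each row.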
--
This message was sent by Atlassian Jira
(v8.20.10#820010)