[jira] [Commented] (ARROW-2358) API for Writing to Multiple Feather Files
[ https://issues.apache.org/jira/browse/ARROW-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511574#comment-16511574 ]

Dhruv Madeka commented on ARROW-2358:
-------------------------------------

Got it! I think I can handle that.

> API for Writing to Multiple Feather Files
> -----------------------------------------
>
>                 Key: ARROW-2358
>                 URL: https://issues.apache.org/jira/browse/ARROW-2358
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C, C++, Python
>    Affects Versions: 0.9.0
>            Reporter: Dhruv Madeka
>            Priority: Minor
>
> It would be really great to have an API which can write a Table to a
> `FeatherDataset`. Essentially, taking a name for a file - it would split the
> table into N-equal parts (which could be determined by the user or the code)
> and then write the data to N files with a suffix (which is `_part` by default
> but could be user specified).

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2358) API for Writing to Multiple Feather Files
[ https://issues.apache.org/jira/browse/ARROW-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511556#comment-16511556 ]

Wes McKinney commented on ARROW-2358:
-------------------------------------

There is a slice function on Column, so implementing {{Table::Slice}} should be reasonably straightforward if you're OK adding some C++ code and adding it to the Python bindings: https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L125. I just opened ARROW-2707 about this.
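The suggestion above, building a table-level slice out of the existing per-column slice, can be illustrated with a small pure-Python sketch. It is not Arrow code: the `Table` alias (a dict of column name to list of values) and the `slice_table` helper are hypothetical stand-ins, chosen only to show that slicing every column with the same offset and length yields a row slice of the whole table.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a table: column name -> list of values.
# All columns are assumed to have the same length.
Table = Dict[str, List[object]]


def slice_table(table: Table, offset: int, length: Optional[int] = None) -> Table:
    """Return rows [offset, offset + length) by slicing every column identically.

    If length is None, the slice runs to the end of each column.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if length is not None and length < 0:
        raise ValueError("length must be non-negative")
    stop = None if length is None else offset + length
    # Apply the same row range to each column so the rows stay aligned.
    return {name: values[offset:stop] for name, values in table.items()}


table = {"id": [1, 2, 3, 4, 5], "name": ["a", "b", "c", "d", "e"]}
middle = slice_table(table, offset=1, length=3)
tail = slice_table(table, offset=3)
```

Python list slicing copies the selected values; the sketch makes no attempt to avoid that and only models the row-range logic.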
[jira] [Commented] (ARROW-2358) API for Writing to Multiple Feather Files
[ https://issues.apache.org/jira/browse/ARROW-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510611#comment-16510611 ]

Dhruv Madeka commented on ARROW-2358:
-------------------------------------

[~wesmckinn] So I'm good to submit a PR. It's just not obvious how to do this without a `slice` function for a table. Would you advise I implement that first and then the FeatherDataset writer?
[jira] [Commented] (ARROW-2358) API for Writing to Multiple Feather Files
[ https://issues.apache.org/jira/browse/ARROW-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510193#comment-16510193 ]

Wes McKinney commented on ARROW-2358:
-------------------------------------

This sounds useful to me. I removed this from the 0.10 milestone for now; please feel free to submit a PR.
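The feature requested in ARROW-2358 (split a table into N roughly equal parts and write each part to its own file with a `_part` suffix) comes down to two small calculations, sketched below in plain Python. The helper names `partition_bounds` and `part_paths` are hypothetical, and the naming scheme (suffix, then a zero-based index, placed before the file extension) is an assumption; the issue only states that the suffix defaults to `_part`.

```python
import os
from typing import List, Tuple


def partition_bounds(num_rows: int, n_parts: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that split num_rows into n_parts chunks.

    Chunk lengths differ by at most one row; earlier chunks get the extras.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    if num_rows < 0:
        raise ValueError("num_rows must be non-negative")
    base, extra = divmod(num_rows, n_parts)
    bounds = []
    offset = 0
    for i in range(n_parts):
        length = base + (1 if i < extra else 0)
        bounds.append((offset, length))
        offset += length
    return bounds


def part_paths(path: str, n_parts: int, suffix: str = "_part") -> List[str]:
    """Build one output path per part (assumed scheme: <root><suffix><i><ext>)."""
    root, ext = os.path.splitext(path)
    return [f"{root}{suffix}{i}{ext}" for i in range(n_parts)]


bounds = partition_bounds(10, 3)
paths = part_paths("data.feather", 3)
```

With a table-level slice available, each `(offset, length)` pair could select the rows for one part, which would then be written to the matching path. In pyarrow releases that provide `Table.slice` and `pyarrow.feather.write_feather`, that pairing would be one possible way to do the actual writing; this is a sketch of the idea, not the API that the issue eventually produced.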