[ https://issues.apache.org/jira/browse/ARROW-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931555#comment-16931555 ]
Antoine Pitrou commented on ARROW-6536:
---------------------------------------

Then passing a schema becomes ambiguous: sometimes the user would like that exact schema, sometimes they would like to have the other columns too. The {{column_names}} and {{column_types}} duo allows both explicitly.

> [C++] CSV reader accept schema
> ------------------------------
>
>                 Key: ARROW-6536
>                 URL: https://issues.apache.org/jira/browse/ARROW-6536
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Neal Richardson
>            Priority: Major
>              Labels: csv, dataset
>
> The CSV reader lets you specify {{column_types}}, but this is an
> {{unordered_map}} of column name and type. Why not accept a Schema instead?
> Isn't that essentially an ordered map? Seems that if you took a Schema, some
> of the validation of what's being passed in would already have been handled.
> Plus, I suspect that the Datasets project will want to do even more with
> passing a Schema (e.g. selecting a subset of columns).
> Thoughts [~pitrou] [~fsaintjacques] [~bkietz]?

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
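
To illustrate the distinction drawn in the comment, here is a minimal toy model of how a list of column names and a partial name-to-type map might combine. It is not Arrow code: {{ResolveColumns}}, {{ResolvedColumn}} and the "inferred" label are hypothetical names invented for this sketch, and it assumes that {{column_names}} lists every column while {{column_types}} only overrides the types of the columns it mentions.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a resolved column: a name plus a type label.
// The label "inferred" marks a column whose type is left to type inference.
using ResolvedColumn = std::pair<std::string, std::string>;

// Toy model (an assumption, not Arrow's implementation): every entry of
// column_names yields one output column, in order. column_types overrides
// the type only for the names it contains.
std::vector<ResolvedColumn> ResolveColumns(
    const std::vector<std::string>& column_names,
    const std::unordered_map<std::string, std::string>& column_types) {
  std::vector<ResolvedColumn> resolved;
  resolved.reserve(column_names.size());
  for (const std::string& name : column_names) {
    auto it = column_types.find(name);
    if (it != column_types.end()) {
      // Explicit type requested for this column.
      resolved.emplace_back(name, it->second);
    } else {
      // No explicit type: the column is kept and its type is inferred.
      resolved.emplace_back(name, std::string("inferred"));
    }
  }
  return resolved;
}
```

In this model, giving a type for every name pins down the exact set of columns and types, while giving types for only some names keeps the other columns with inferred types. A single schema argument would not say which of the two the caller intends.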