Francois Saint-Jacques created ARROW-8065:
---------------------------------------------

             Summary: [C++][Dataset] Untangle Dataset, Fragment and ScanOptions
                 Key: ARROW-8065
                 URL: https://issues.apache.org/jira/browse/ARROW-8065
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Francois Saint-Jacques


We should be able to list fragments without going through the 
Scanner/ScanOptions hoops. This exposes a flaw in the current API, which 
requires a ScanOptions to create a Fragment. This is also a problem for 
ARROW-7824: why do we need a ScanOptions (read manifest) to write record 
batches to a given path?
 # Remove {{ScanOptions}} from Fragment's properties and move it into 
{{Fragment::Scan}} parameters.
 # Remove {{ScanOptions}} from {{Dataset::GetFragments}}. If required, we can 
still provide an alternate signature, e.g. 
{{Dataset::GetFragments(std::shared_ptr<Expression> predicate)}}, for sub-tree 
pruning in FileSystemDataset.
 # The {{Fragment}} constructor should take a schema (and store it as a 
property), usually extracted from the Dataset schema. Update the {{schema()}} 
method accordingly.
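
To make the three items above concrete, here is a rough sketch of the shape the 
API could take. It is not the actual Arrow code: {{Schema}}, {{ScanOptions}}, 
{{Fragment}} and {{Dataset}} below are simplified, hypothetical stand-ins, rows 
are plain ints, and {{Predicate}} (a callable on a partition string) stands in 
for {{std::shared_ptr<Expression>}}. {{AddFragment}} is a helper invented for the 
illustration only.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; none of these are the real Arrow classes.
struct Schema {
  std::vector<std::string> field_names;
};

// Scan-time configuration only; listing fragments never needs it.
struct ScanOptions {
  std::function<bool(int)> filter = [](int) { return true; };
};

// Stand-in for an Expression evaluated against a partition string.
using Predicate = std::function<bool(const std::string&)>;

class Fragment {
 public:
  // Item 3: the schema is passed at construction and stored as a property.
  Fragment(std::shared_ptr<Schema> schema, std::string partition,
           std::vector<int> rows)
      : schema_(std::move(schema)),
        partition_(std::move(partition)),
        rows_(std::move(rows)) {}

  const std::shared_ptr<Schema>& schema() const { return schema_; }
  const std::string& partition() const { return partition_; }

  // Item 1: ScanOptions is a parameter of Scan, not a member of Fragment.
  std::vector<int> Scan(const ScanOptions& options) const {
    std::vector<int> out;
    for (int row : rows_) {
      if (options.filter(row)) out.push_back(row);
    }
    return out;
  }

 private:
  std::shared_ptr<Schema> schema_;
  std::string partition_;
  std::vector<int> rows_;
};

class Dataset {
 public:
  explicit Dataset(std::shared_ptr<Schema> schema)
      : schema_(std::move(schema)) {}

  // Illustration-only helper: fragments inherit the dataset schema.
  void AddFragment(std::string partition, std::vector<int> rows) {
    fragments_.push_back(std::make_shared<Fragment>(
        schema_, std::move(partition), std::move(rows)));
  }

  // Item 2: listing fragments requires no ScanOptions.
  std::vector<std::shared_ptr<Fragment>> GetFragments() const {
    return fragments_;
  }

  // Item 2 (alternate signature): prune fragments with a predicate.
  std::vector<std::shared_ptr<Fragment>> GetFragments(
      const Predicate& predicate) const {
    std::vector<std::shared_ptr<Fragment>> out;
    for (const auto& fragment : fragments_) {
      if (predicate(fragment->partition())) out.push_back(fragment);
    }
    return out;
  }

 private:
  std::shared_ptr<Schema> schema_;
  std::vector<std::shared_ptr<Fragment>> fragments_;
};
```

With this shape, a writer (as in ARROW-7824) or any caller that only wants to 
enumerate fragments never has to build a ScanOptions. It is only constructed at 
the point where {{Fragment::Scan}} is actually called.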



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
