[ 
https://issues.apache.org/jira/browse/ARROW-15258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482093#comment-17482093
 ] 

Weston Pace commented on ARROW-15258:
-------------------------------------

I'd like to avoid ScanOptions entirely but I'm not opposed to using 
InMemoryDataset.

A filter should not be needed.  When scanning, a filter is only useful if we 
can push it down to reduce the amount of data we read from disk; otherwise a 
filter node is sufficient.

A projection should not be needed either.  When scanning, a projection is only 
useful if we can push it down to reduce the amount of data we read from disk, 
i.e. to choose which columns we read.  Otherwise a project node is sufficient.

The only parameter that probably makes sense is batch size.

If you want to use InMemoryDataset then that is one possible implementation.  
You can just hide the creation of ScanOptions from the user and create your own 
default ScanOptions with the default projection and no filter.

Otherwise you can create a record batch reader from a table.  I think we have 
examples of how to expose a record batch reader as a generator, but you would 
need to do your own slicing (for batch size) on top of that.
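
To illustrate the "do your own slicing" part, here is a minimal sketch of the 
arithmetic only.  It does not use Arrow; SliceRanges is a hypothetical helper 
name, not an existing Arrow function.  It assumes a batch of num_rows rows is 
to be cut into consecutive pieces of at most batch_size rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Arrow): split `num_rows` rows into
// consecutive (offset, length) ranges of at most `batch_size` rows each.
// The last range is shorter when num_rows is not a multiple of batch_size.
std::vector<std::pair<int64_t, int64_t>> SliceRanges(int64_t num_rows,
                                                     int64_t batch_size) {
  assert(batch_size > 0);  // a non-positive batch size would never advance
  std::vector<std::pair<int64_t, int64_t>> ranges;
  for (int64_t offset = 0; offset < num_rows; offset += batch_size) {
    ranges.emplace_back(offset, std::min(batch_size, num_rows - offset));
  }
  return ranges;
}
```

Each (offset, length) pair could then be passed to RecordBatch::Slice(offset, 
length), which is zero-copy, before the slice is handed to the generator.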

> [C++] Easy options to create a source node from a table
> -------------------------------------------------------
>
>                 Key: ARROW-15258
>                 URL: https://issues.apache.org/jira/browse/ARROW-15258
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: C++
>            Reporter: Weston Pace
>            Assignee: Vibhatha Lakmal Abeykoon
>            Priority: Major
>
> Given a Table there should be a very simple way to create a source node.  
> Something like:
> {code}
>   std::shared_ptr<Table> table = ...
>   ARROW_RETURN_NOT_OK(arrow::compute::MakeExecNode(
>       "table", plan, {}, arrow::compute::TableSourceOptions{table.get()}));
> {code}



