[ https://issues.apache.org/jira/browse/ARROW-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435669#comment-17435669 ]

Weston Pace commented on ARROW-14254:
-------------------------------------

> The prop version may not give an exact percentage

That inexactness is the only reason you would need two scans, and only if the 
user provided a percentage.

If the user provided an exact number then you are correct: top-k alone is fine.
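
To make the distinction concrete, here is a rough sketch in Python. It is an illustration only, not Arrow's API or a proposed implementation. The names `sample_top_k`, `sample_exact_fraction`, and `scan` are hypothetical, and a plain Python iterable stands in for a scan of the data. The idea is to give every row a random key and keep the k rows with the smallest keys, which needs one pass when k is known. An exact percentage needs a row count first, hence the second scan.

```python
import heapq
import random


def sample_top_k(rows, k, seed=None):
    """Pick k rows uniformly at random in one pass (top-k on random keys)."""
    if k <= 0:
        return []
    rng = random.Random(seed)
    # Min-heap on the negated key, so heap[0] is the kept row with the
    # largest random key, i.e. the next one to be evicted.
    heap = []
    for index, row in enumerate(rows):
        entry = (-rng.random(), index, row)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # This row's key is smaller than the largest kept key.
            heapq.heapreplace(heap, entry)
    # Return the kept rows in their original order.
    return [row for _, _, row in sorted(heap, key=lambda e: e[1])]


def sample_exact_fraction(scan, fraction, seed=None):
    """Pick an exact fraction of rows; `scan` returns a fresh iterator."""
    total = sum(1 for _ in scan())          # first scan: count the rows
    k = round(total * fraction)
    return sample_top_k(scan(), k, seed)    # second scan: top-k sample
```

With an exact count, `sample_top_k(rows, 10)` touches the data once. With a percentage, `sample_exact_fraction(lambda: iter(rows), 0.25)` has to read it twice, unless the row count is already known from metadata.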

> [C++] Return a random sample of rows from a query
> -------------------------------------------------
>
>                 Key: ARROW-14254
>                 URL: https://issues.apache.org/jira/browse/ARROW-14254
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nicola Crane
>            Priority: Major
>              Labels: kernel, query-engine
>             Fix For: 7.0.0
>
>
> Please can we have a kernel that returns a random sample of rows? We've had a 
> request to be able to do this in R: 
> https://github.com/apache/arrow-cookbook/issues/83



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
