[ 
https://issues.apache.org/jira/browse/ARROW-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17538340#comment-17538340
 ] 

Ariana Villegas edited comment on ARROW-14314 at 5/17/22 7:46 PM:
------------------------------------------------------------------

{quote}{{transformed_indices = take(transformed_sort_idx, indices)}} and 
{{sort_indices = sort_indices(transformed_indices)}} ?
{quote}
Yes.
{quote}It should also get sort_indices = [0, 3, 4, 5, 1, 2] as an answer. Does 
it?
{quote}
Right now it doesn't; the result is: [0, 3, 4, 5, 2, 1]

To avoid that, we can replace every index that points at a null value with a 
null index, so the problem will look like this:

values: ['a', null, 'b', 'c']

indices: [0, null, null, 0, 2, 3]
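A minimal sketch of the idea in plain Python (not the Arrow C++ kernels), in case it helps the discussion. The helper names and the raw input indices [0, null, 1, 0, 2, 3] are assumptions chosen for illustration; they reproduce the result reported above, but they are not taken from the actual implementation. Nulls are modelled as None and sorted last.

```python
from typing import List, Optional


def rank_values(values: List[Optional[str]]) -> List[int]:
    """Rank of each dictionary value in sorted order; a null value ranks last."""
    order = sorted(
        range(len(values)),
        key=lambda i: (values[i] is None, "" if values[i] is None else values[i]),
    )
    ranks = [0] * len(values)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks


def take(ranks: List[int], indices: List[Optional[int]]) -> List[Optional[int]]:
    """Look up the rank for every index; a null index stays null."""
    return [None if i is None else ranks[i] for i in indices]


def sort_indices(keys: List[Optional[int]]) -> List[int]:
    """Stable argsort with nulls placed last."""
    return sorted(
        range(len(keys)),
        key=lambda i: (keys[i] is None, 0 if keys[i] is None else keys[i]),
    )


def normalize_null_indices(
    values: List[Optional[str]], indices: List[Optional[int]]
) -> List[Optional[int]]:
    """Turn every index that points at a null value into a null index."""
    return [None if i is None or values[i] is None else i for i in indices]


def sort_dictionary(
    values: List[Optional[str]], indices: List[Optional[int]]
) -> List[int]:
    """take() the value ranks through the indices, then sort the result."""
    return sort_indices(take(rank_values(values), indices))


values = ["a", None, "b", "c"]
raw_indices = [0, None, 1, 0, 2, 3]  # hypothetical input, see the note above

# Without normalization, the null value and the null index sort differently.
print(sort_dictionary(values, raw_indices))  # [0, 3, 4, 5, 2, 1]

# After normalization both kinds of null are null indices, and the stable
# sort keeps them in their original relative order.
fixed_indices = normalize_null_indices(values, raw_indices)
print(fixed_indices)                           # [0, None, None, 0, 2, 3]
print(sort_dictionary(values, fixed_indices))  # [0, 3, 4, 5, 1, 2]
```

In this sketch the mismatch arises because position 2 receives a real (largest) rank through the null value, while position 1 stays null after take(), so the two logical nulls no longer compare as equal.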

 

[~apitrou], btw, why do we allow nulls in values? Wouldn't it be easier to 
have them only in indices?


> [C++] Sorting dictionary array not implemented
> ----------------------------------------------
>
>                 Key: ARROW-14314
>                 URL: https://issues.apache.org/jira/browse/ARROW-14314
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Neal Richardson
>            Priority: Major
>              Labels: kernel
>             Fix For: 9.0.0
>
>
> From R, taking the stock {{mtcars}} dataset and giving it a dictionary type 
> column:
> {code}
> mtcars %>% 
>   mutate(cyl = as.factor(cyl)) %>% 
>   Table$create() %>% 
>   arrange(cyl) %>% 
>   collect()
> Error: Type error: Sorting not supported for type dictionary<values=string, indices=int8, ordered=0>
> ../src/arrow/compute/kernels/vector_array_sort.cc:427  VisitTypeInline(type, this)
> ../src/arrow/compute/kernels/vector_sort.cc:148  GetArraySorter(*physical_type_)
> ../src/arrow/compute/kernels/vector_sort.cc:1206  sorter.Sort()
> ../src/arrow/compute/api_vector.cc:259  CallFunction("sort_indices", {datum}, &options, ctx)
> ../src/arrow/compute/exec/order_by_impl.cc:53  SortIndices(table, options_, ctx_)
> ../src/arrow/compute/exec/sink_node.cc:292  impl_->DoFinish()
> ../src/arrow/compute/exec/exec_plan.cc:297  iterator_.Next()
> ../src/arrow/record_batch.cc:318  ReadNext(&batch)
> ../src/arrow/record_batch.cc:329  ReadAll(&batches)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
