[ 
https://issues.apache.org/jira/browse/ARROW-12676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Kavanagh updated ARROW-12676:
----------------------------------
    Issue Type: Bug  (was: New Feature)

> RecordBatchBuilder with uint dictionary creates signed int Batch
> ----------------------------------------------------------------
>
>                 Key: ARROW-12676
>                 URL: https://issues.apache.org/jira/browse/ARROW-12676
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 3.0.0
>            Reporter: Kyle Kavanagh
>            Priority: Major
>
> When a RecordBatchBuilder whose schema has a dictionary type with a uint32
> index is flushed to a batch, the resulting batch contains an int32 index:
> {code:java}
> BatchBuilder schema after flush: 
> Symbol: dictionary<values=string, indices=int16, ordered=0>
> Status: dictionary<values=string, indices=uint32, ordered=0>{code}
> {code:java}
> Batch schema after flush:
> Symbol: dictionary<values=string, indices=int16, ordered=0>
> Status: dictionary<values=string, indices=int32, ordered=0>
> {code}
> from:
> {code:java}
> std::shared_ptr<arrow::RecordBatch> batch;
> auto status = batchBuilder_->Flush(&batch);
> std::cout << "BatchBuilder schema after flush: \n"
>           << batchBuilder_->schema()->ToString() << std::endl;
> std::cout << "Batch schema after flush: \n"
>           << batch->schema()->ToString() << std::endl;
> if (!status.ok()) {
>   throw Exception("Arrow batch flush failed: {}", status);
> }{code}
> This results in a failure to write: "Invalid: Tried to write record batch 
> with different schema"
> I believe this is related to https://issues.apache.org/jira/browse/ARROW-9969 
> and in particular, this bit: 
> [https://github.com/apache/arrow/blob/master/cpp/src/arrow/table_builder.cc#L72]
> Does the dictionary->Equals comparison check the signedness of the indices?
>  
>  
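The mismatch described above can be caught before the write by comparing the index type of each dictionary field in the builder schema against the flushed batch schema. The following is a minimal, standard-library-only sketch of that check. It is not Arrow code: `IndexType`, `DictField`, `Schema`, and `FirstIndexMismatch` are hypothetical stand-ins invented for illustration, and the sketch assumes that only field names and index types matter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Arrow classes.
enum class IndexType { Int16, Int32, UInt32 };

struct DictField {
  std::string name;
  IndexType index_type;  // integer type used for the dictionary indices
};

using Schema = std::vector<DictField>;

std::string ToString(IndexType t) {
  switch (t) {
    case IndexType::Int16:  return "int16";
    case IndexType::Int32:  return "int32";
    case IndexType::UInt32: return "uint32";
  }
  assert(false && "unhandled index type");
  return "unknown";
}

// Returns an empty string when the schemas agree, otherwise a description of
// the first difference. Signedness is part of the comparison because Int32
// and UInt32 are distinct enumerators.
std::string FirstIndexMismatch(const Schema& expected, const Schema& actual) {
  if (expected.size() != actual.size()) {
    return "field count differs";
  }
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (expected[i].name != actual[i].name) {
      return "field " + std::to_string(i) + ": name differs";
    }
    if (expected[i].index_type != actual[i].index_type) {
      return expected[i].name + ": expected indices=" +
             ToString(expected[i].index_type) + ", got indices=" +
             ToString(actual[i].index_type);
    }
  }
  return "";
}
```

Called with the two schemas printed in the report (Symbol with int16 indices in both, Status with uint32 indices in the builder and int32 in the batch), the function reports the Status field as the first mismatch. That is the field that would lead the writer to reject the batch.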



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
