arthurpassos commented on code in PR #35825:
URL: https://github.com/apache/arrow/pull/35825#discussion_r1210580352


##########
cpp/src/arrow/array/builder_dict.h:
##########
@@ -724,6 +747,7 @@ using BinaryDictionaryBuilder = DictionaryBuilder<BinaryType>;
 using StringDictionaryBuilder = DictionaryBuilder<StringType>;
 using BinaryDictionary32Builder = Dictionary32Builder<BinaryType>;
 using StringDictionary32Builder = Dictionary32Builder<StringType>;
+using BinaryDictionary64Builder = Dictionary64Builder<LargeBinaryType>;

Review Comment:
   Hi @pitrou. First of all, thanks for looking into this.
   
   I am trying to fix the issue described in
   https://github.com/apache/arrow/issues/32723. It pops up when the data of
   complex data structures ends up being chunked. The goal of this PR is to
   introduce a setting that allows the use of the LARGE* variants of the
   string / binary types to avoid chunking, as suggested by @emkornfield.
   
   I simply followed my intuition that if the non-large types rely on 32 bits,
   the large types would rely on 64 bits.
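   
   To make the 32-bit versus 64-bit reasoning concrete: the non-large
   string / binary types address their value bytes with 32-bit offsets, while
   the LARGE* variants use 64-bit offsets. The snippet below is only a rough,
   standalone sketch of that limit in plain C++ with no Arrow headers;
   `MaxValueBytes`, `FitsInOneArray` and `kThreeGiB` are hypothetical names
   made up for illustration and are not part of Arrow or of this PR.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not an Arrow API: the largest number of value bytes a
// single array can address when its offsets are stored as OffsetT.
template <typename OffsetT>
constexpr std::int64_t MaxValueBytes() {
  return static_cast<std::int64_t>(std::numeric_limits<OffsetT>::max());
}

// Hypothetical helper: true if `total_bytes` of string / binary data can be
// addressed by one array with OffsetT offsets, false if the data would have
// to be split across several chunks.
template <typename OffsetT>
constexpr bool FitsInOneArray(std::int64_t total_bytes) {
  return total_bytes >= 0 && total_bytes <= MaxValueBytes<OffsetT>();
}

// 3 GiB of value data: beyond what 32-bit offsets can address, but well
// within the range of 64-bit offsets.
constexpr std::int64_t kThreeGiB = 3LL * 1024 * 1024 * 1024;

static_assert(!FitsInOneArray<std::int32_t>(kThreeGiB),
              "32-bit offsets cannot address 3 GiB in a single array");
static_assert(FitsInOneArray<std::int64_t>(kThreeGiB),
              "64-bit offsets can address 3 GiB in a single array");
```

   With 32-bit offsets, anything past roughly 2 GiB of value data has to be
   split into chunks. With 64-bit offsets the same data fits in one array,
   which is what the new setting is meant to take advantage of.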



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
