alamb opened a new issue, #7698: URL: https://github.com/apache/arrow-rs/issues/7698
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

@scovich noted in https://github.com/apache/arrow-rs/pull/7653#discussion_r2147021588

> It's a bit unfortunate we have to double-allocate strings. Ideally the dictionary could track &str backed by dict_keys, but I don't know how to manage mutability and lifetimes to achieve that...

The basic question / observation is that building up the dictionary will likely be a performance bottleneck. For example, something like this will likely be pretty slow:

```rust
let mut builder = VariantBuilder::new();
let mut object = builder.new_object();
for i in 0..10000 {
    // every iteration adds a brand-new field name to the dictionary
    object.set_value(&format!("foo{i}"), 42);
}
```

**Describe the solution you'd like**
1. Figure out a more performant way to manage the dictionary
2. Also figure out how to create sorted dictionaries

**Describe alternatives you've considered**

@zeroshade described a pretty elegant solution in the Go variant builder, which is to write the dictionary fields in the order they are encountered and then decide whether the dictionary is sorted when writing it out.
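
The idea above (record field names in encounter order, and only decide at write time whether the result happens to be sorted) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the arrow-rs or Go implementation: `FieldNameDictionary` and its methods are hypothetical names, and sharing one allocation through `Rc<str>` is just one possible way to sidestep the double allocation mentioned in the quote.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical field-name dictionary (illustration only, not a real arrow-rs type).
struct FieldNameDictionary {
    /// Names in the order they were first seen; the index is the field id.
    names: Vec<Rc<str>>,
    /// Reverse lookup from name to field id. Each key shares its string
    /// allocation with the matching entry in `names`.
    ids: HashMap<Rc<str>, u32>,
    /// True while `names` is in strictly increasing order.
    sorted: bool,
}

impl FieldNameDictionary {
    fn new() -> Self {
        Self { names: Vec::new(), ids: HashMap::new(), sorted: true }
    }

    /// Returns the id for `name`, appending it to the dictionary if it is new.
    fn add_field_name(&mut self, name: &str) -> u32 {
        // `Rc<str>: Borrow<str>`, so the lookup needs no allocation.
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        // A new name that does not sort after the previous one breaks sortedness.
        if let Some(last) = self.names.last() {
            if name <= &**last {
                self.sorted = false;
            }
        }
        let id = self.names.len() as u32;
        let shared: Rc<str> = Rc::from(name); // the single string allocation
        self.names.push(Rc::clone(&shared));
        self.ids.insert(shared, id);
        id
    }

    /// Whether the names, in insertion order, are sorted and unique.
    fn is_sorted(&self) -> bool {
        self.sorted
    }
}

/// Example usage: pre-registering names in order keeps the dictionary sorted.
fn example_usage() -> bool {
    let mut dict = FieldNameDictionary::new();
    for name in ["a", "b", "c"] {
        dict.add_field_name(name);
    }
    dict.add_field_name("b"); // already known, so the order is unchanged
    dict.is_sorted()
}
```

A writer that cares about sorted dictionaries would check `is_sorted()` when emitting the metadata; one that does not can ignore the flag, since ids are stable either way.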
The builder has an additional API that you can use to directly add a dictionary key.

One benefit of this approach is that different writers can decide whether they care about sorted dictionaries; if they do, they can do a pre-pass to figure out the field names and add them directly to the builder.

For example, this might look like:

```rust
let mut builder = VariantBuilder::new();
// somehow we know that the object will only have keys "a", "b" and "c"
builder.add_field_name("a");
builder.add_field_name("b");
builder.add_field_name("c");
// as long as the objects don't add a new field name, the keys remain sorted
let mut object = builder.new_object();
object.set_field("b", 4);
object.set_field("c", 5);
object.set_field("a", 6);
// The final variant will have sorted dictionaries
let (metadata, value) = builder.build();
let variant_metadata = VariantMetadata::try_new(&metadata).unwrap();
assert!(variant_metadata.is_sorted());
```

**Additional context**

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
