alamb opened a new issue #1299:
URL: https://github.com/apache/arrow-rs/issues/1299


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   I am trying to create a dictionary array from keys and values that I already 
have, either to ensure that several arrays actually share one dictionary or to 
avoid rebuilding a dictionary when the data is already dictionary encoded.
   
   Right now if you need to do this, you end up with code that is fairly messy, 
as you can see in 
https://github.com/apache/arrow-rs/pull/1263#discussion_r803783498:
   
   ```rust
       fn get_dict_arraydata(
           keys: Buffer,
           key_type: DataType,
           value_data: ArrayData,
       ) -> ArrayData {
           let value_type = value_data.data_type().clone();
           let dict_data_type =
               DataType::Dictionary(Box::new(key_type), Box::new(value_type));
           ArrayData::builder(dict_data_type)
               .len(3)
               .add_buffer(keys)
               .add_child_data(value_data)
               .build()
               .unwrap()
       }
   
       #[test]
       fn test_eq_dyn_dictionary_i8_array() {
           let key_type = DataType::Int8;
           // Construct a value array
           let value_data = ArrayData::builder(DataType::Int8)
               .len(8)
               .add_buffer(Buffer::from(
                   &[10_i8, 11, 12, 13, 14, 15, 16, 17].to_byte_slice(),
               ))
               .build()
               .unwrap();
   
           let keys1 = Buffer::from(&[2_i8, 3, 4].to_byte_slice());
           let keys2 = Buffer::from(&[2_i8, 4, 4].to_byte_slice());
           let dict_array1: DictionaryArray<Int8Type> = Int8DictionaryArray::from(
               get_dict_arraydata(keys1, key_type.clone(), value_data.clone()),
           );
           let dict_array2: DictionaryArray<Int8Type> = Int8DictionaryArray::from(
               get_dict_arraydata(keys2, key_type, value_data),
           );
   
           let result = eq_dyn(&dict_array1, &dict_array2);
           assert!(result.is_ok());
           assert_eq!(result.unwrap(), BooleanArray::from(vec![true, false, true]));
       }
   ```
   
   **Describe the solution you'd like**
   It would be nice to have a way to create a `DictionaryArray` directly from 
its keys and values:
   
   ```
   let dict_array1 = DictionaryArray::<Int8Type>::try_new(keys1, values.clone()).unwrap();
   ```
   
   So the entire test would look like:
   ```rust
       #[test]
       fn test_eq_dyn_dictionary_i8_array() {
           // Construct a value array
           let values = Int8Array::from_iter_values([10_i8, 11, 12, 13, 14, 15, 16, 17]);
   
           let keys1 = Int8Array::from_iter_values([2_i8, 3, 4]);
           let keys2 = Int8Array::from_iter_values([2_i8, 4, 4]);
           // `try_new` is the constructor proposed in this issue
           let dict_array1 = DictionaryArray::<Int8Type>::try_new(keys1, values.clone()).unwrap();
           let dict_array2 = DictionaryArray::<Int8Type>::try_new(keys2, values).unwrap();
   
           let result = eq_dyn(&dict_array1, &dict_array2);
           assert!(result.is_ok());
           assert_eq!(result.unwrap(), BooleanArray::from(vec![true, false, true]));
       }
   ```
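
To make the intent concrete, here is a rough, std-only sketch of what such a constructor might need to do. `DictArray`, its fields, and `eq_elementwise` are hypothetical stand-ins invented for illustration and are not arrow-rs types. The sketch assumes the constructor should reject keys that do not index into the values, and it uses `Arc` to show two arrays sharing one set of values without copying.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a dictionary-encoded array (not an arrow-rs type):
/// each key is an index into a shared `values` buffer.
#[derive(Debug, Clone)]
struct DictArray<V> {
    keys: Vec<i8>,
    values: Arc<Vec<V>>,
}

impl<V: PartialEq> DictArray<V> {
    /// Builds the array directly from existing keys and values.
    /// Returns an error if any key is negative or past the end of `values`.
    fn try_new(keys: Vec<i8>, values: Arc<Vec<V>>) -> Result<Self, String> {
        for (pos, &key) in keys.iter().enumerate() {
            let in_range = key >= 0 && (key as usize) < values.len();
            if !in_range {
                return Err(format!(
                    "key {} at position {} is out of range for {} values",
                    key,
                    pos,
                    values.len()
                ));
            }
        }
        Ok(Self { keys, values })
    }

    /// Compares the decoded values position by position,
    /// stopping at the end of the shorter array.
    fn eq_elementwise(&self, other: &Self) -> Vec<bool> {
        self.keys
            .iter()
            .zip(&other.keys)
            .map(|(&a, &b)| self.values[a as usize] == other.values[b as usize])
            .collect()
    }
}
```

With this shape, two arrays built from `Arc::clone(&values)` point at the same allocation, which is the sharing guarantee asked for above. Whether the real constructor should take owned arrays, references, or an `ArrayRef` is left open here.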
   
   

