alamb commented on a change in pull request #1296: URL: https://github.com/apache/arrow-rs/pull/1296#discussion_r804061136
########## File path: arrow/src/array/iterator.rs ########## @@ -403,6 +403,109 @@ impl<'a, S: OffsetSizeTrait> std::iter::ExactSizeIterator { } +/// an iterator that returns Some(T) or None, that can be used on any DictionaryArray +// Note: This implementation is based on std's [Vec]s' [IntoIter]. +#[derive(Debug)] +pub struct DictionaryIter<'a, K: ArrowPrimitiveType, T: ArrowPrimitiveType> { + array: &'a DictionaryArray<K>, + values: &'a PrimitiveArray<T>, + current: usize, + current_end: usize, +} Review comment: 🤔 as you note, the way this is implemented, it will only work for DictionaryArrays whose values are `PrimitiveArray<>` such as `Int64Array` If we take this approach, we'll probably end up with different iterator types for different value types like ```rust pub struct DictionaryIterPrimitive<'a, K: ArrowPrimitiveType, T: ArrowPrimitiveType> { ... } impl<'a, K: ArrowPrimitiveType, T: ArrowPrimitiveType> std::iter::Iterator for DictionaryIter<'a, K, T> { type Item = Option<T::Native>; ... } ``` ```rust pub struct DictionaryIterBool<'a, K: ArrowPrimitiveType> { ... } impl <'a, K: ArrowPrimitiveType> std::iter::Iterator for DictionaryIterBool<'a, K> { type Item = Option<bool>; ... } ... ``` And then functions like the following ```rust impl DictionaryArray { pub fn iter_primitive<T>::(&self) -> DictionaryIterPrimitive ... } pub fn iter_bool<T>::(&self) -> DictionaryIterBool ... } } ``` Which would effectively require the user to know what type of values their dictionary held 🤔 ########## File path: arrow/src/array/iterator.rs ########## @@ -403,6 +403,109 @@ impl<'a, S: OffsetSizeTrait> std::iter::ExactSizeIterator { } +/// an iterator that returns Some(T) or None, that can be used on any DictionaryArray +// Note: This implementation is based on std's [Vec]s' [IntoIter]. +#[derive(Debug)] +pub struct DictionaryIter<'a, K: ArrowPrimitiveType, T: ArrowPrimitiveType> { + array: &'a DictionaryArray<K>, + values: &'a PrimitiveArray<T>, + current: usize, + current_end: usize, +} Review comment: Thinking some more we could potentially avoid parameterizing on the key type by always upcasting the indexes to `usize` (which has to happen anyways) before creating the iterator -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org