seddonm1 commented on a change in pull request #9565:
URL: https://github.com/apache/arrow/pull/9565#discussion_r585040519
##########
File path: rust/datafusion/src/physical_plan/string_expressions.rs
##########
@@ -361,10 +534,167 @@ pub fn ltrim<T: StringOffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
}
}
-/// Converts the string to all lower case.
-/// lower('TOM') = 'tom'
-pub fn lower(args: &[ColumnarValue]) -> Result<ColumnarValue> {
- handle(args, |x| x.to_ascii_lowercase(), "lower")
+/// Returns last n characters in the string, or when n is negative, returns all but first |n| characters.
+/// right('abcde', 2) = 'de'
+pub fn right<T: StringOffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
+ let string_array: &GenericStringArray<T> = args[0]
+ .as_any()
+ .downcast_ref::<GenericStringArray<T>>()
+ .ok_or_else(|| {
+ DataFusionError::Internal("could not cast string to StringArray".to_string())
+ })?;
+
+ let n_array: &Int64Array =
+ args[1]
+ .as_any()
+ .downcast_ref::<Int64Array>()
+ .ok_or_else(|| {
+ DataFusionError::Internal("could not cast n to Int64Array".to_string())
+ })?;
+
+ let result = string_array
+ .iter()
+ .enumerate()
+ .map(|(i, x)| {
+ if n_array.is_null(i) {
+ None
+ } else {
+ x.map(|x: &str| {
+ let n: i64 = n_array.value(i);
Review comment:
Hi @jorgecarleitao, thanks for this.
My understanding is that one of the core properties of a `RecordBatch` is
that all columns must have the same length:
https://github.com/apache/arrow/blob/master/rust/arrow/src/record_batch.rs#L52
implemented here:
https://github.com/apache/arrow/blob/master/rust/arrow/src/record_batch.rs#L134
From what I can see, if we did adopt a `zip` then we would implicitly be
tolerating a shorter argument: `zip` simply stops at the shorter input, which
won't trip the out-of-bounds check but might produce some very strange
function results.
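To make the length-mismatch concern concrete, here is a minimal sketch that uses plain slices of `Option` values as stand-ins for the Arrow arrays. The names `right_one`, `right_zip` and `right_checked` are hypothetical and are not part of this PR. In standard Rust, `Iterator::zip` ends as soon as either side is exhausted, so a mismatch shows up as missing rows rather than as a panic:

```rust
/// Hypothetical scalar helper: last `n` characters of `s`, or, when `n` is
/// negative, everything except the first |n| characters.
fn right_one(s: &str, n: i64) -> String {
    let len = s.chars().count() as i64;
    // Number of leading characters to skip.
    let skip = if n >= 0 {
        (len - n).max(0)
    } else {
        n.saturating_neg().min(len)
    };
    s.chars().skip(skip as usize).collect()
}

/// Zip-based variant over plain slices (stand-ins for the Arrow arrays).
/// A null on either side yields a null result.
fn right_zip(strings: &[Option<&str>], ns: &[Option<i64>]) -> Vec<Option<String>> {
    strings
        .iter()
        .zip(ns.iter())
        .map(|(s, n)| match (s, n) {
            (Some(s), Some(n)) => Some(right_one(s, *n)),
            _ => None,
        })
        .collect()
}

/// Same as `right_zip`, but rejects inputs of different lengths up front
/// instead of silently truncating.
fn right_checked(
    strings: &[Option<&str>],
    ns: &[Option<i64>],
) -> Result<Vec<Option<String>>, String> {
    if strings.len() != ns.len() {
        return Err(format!(
            "length mismatch: {} strings vs {} n values",
            strings.len(),
            ns.len()
        ));
    }
    Ok(right_zip(strings, ns))
}
```

With three strings and two `n` values, `right_zip` returns only two rows, while `right_checked` returns an error. Since a `RecordBatch` already guarantees equal column lengths, the explicit check would only matter for callers that build the argument arrays some other way.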
I do agree with you that many of the core Rust Arrow implementations are
throwing away the benefits of the Rust compiler so we should try to sensibly
refactor for safety.