nevi-me commented on a change in pull request #9469:
URL: https://github.com/apache/arrow/pull/9469#discussion_r574709598



##########
File path: rust/arrow/src/array/array.rs
##########
@@ -326,6 +327,174 @@ pub fn new_empty_array(data_type: &DataType) -> ArrayRef {
     let data = ArrayData::new_empty(data_type);
     make_array(Arc::new(data))
 }
+/// Creates a new array with null slots of `length`
+pub fn new_array_with_nulls(data_type: &DataType, length: usize) -> ArrayRef {

Review comment:
       Are we happy with this name, or should we use `new_null_array`?

##########
File path: rust/arrow/src/array/array.rs
##########
@@ -409,4 +578,42 @@ mod tests {
         assert_eq!(a.len(), 0);
         assert_eq!(a.value_offsets()[0], 0i32);
     }
+
+    #[test]
+    fn test_null_boolean() {
+        let array = new_array_with_nulls(&DataType::Boolean, 9);
+        let a = array.as_any().downcast_ref::<BooleanArray>().unwrap();
+        assert_eq!(a.len(), 9);
+        for i in 0..9 {
+            assert!(a.is_null(i));
+        }
+    }
+
+    #[test]
+    fn test_null_primitive() {
+        let array = new_array_with_nulls(&DataType::Int32, 9);
+        let a = array.as_any().downcast_ref::<Int32Array>().unwrap();
+        assert_eq!(a.len(), 9);
+        for i in 0..9 {
+            assert!(a.is_null(i));
+        }
+    }
+
+    #[test]
+    fn test_null_variable_sized() {
+        let array = new_array_with_nulls(&DataType::Utf8, 9);
+        let a = array.as_any().downcast_ref::<StringArray>().unwrap();
+        assert_eq!(a.len(), 9);
+        assert_eq!(a.value_offsets()[9], 0i32);
+    }
+
+    #[test]
+    fn test_null_list_primitive() {
+        let data_type =
+            DataType::List(Box::new(Field::new("item", DataType::Int32, 
true)));

Review comment:
       There's a soundness hole here, that applies more for parquet. If I 
created this array as non-nullable, there wouldn't have been an error, even 
though this doesn't comply with the spec.
   
   @alamb @jorgecarleitao @dandandan should we panic, or should I change the 
signature to `Result<ArrayRef>` so I can validate this condition?
   
   I'm leaning to a panic, just like we panic on `impl From<ArrayDataRef> for 
{type}Array`. Wrapping a function that doesn't really error out, in a Result 
feels like assembing the Avengers for a single villain.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to