lidavidm commented on code in PR #858:
URL: https://github.com/apache/arrow-go/pull/858#discussion_r3449144366


##########
arrow/array/record.go:
##########
@@ -231,6 +231,28 @@ func (rec *simpleRecord) validate() error {
        return nil
 }
 
+// nullCountInRows returns the number of top-level nulls among the first nrows
+// rows of arr. It does not recurse into child arrays, so a nullable struct 
whose
+// non-nullable children carry nulls in null parent slots is not flagged; only 
a
+// field's own top-level nullability is enforced. Run-end-encoded logical 
nulls,
+// held in the values child rather than a top-level bitmap, are not counted.
+func nullCountInRows(arr arrow.Array, nrows int64) int {

Review Comment:
   (1) don't arrays cache their null count? can't we check that first as a fast 
path?
   (2) do we really care about the exact count? why do we need to iterate 
through the entire array?



##########
arrow/array/record.go:
##########
@@ -385,11 +407,26 @@ func (b *RecordBuilder) Resize(n int) {
 //
 // The returned RecordBatch must be Release()'d after use.
 //
-// NewRecordBatch panics if the fields' builder do not have the same length.
+// NewRecordBatch panics if the fields' builders do not have the same length 
or if
+// a non-nullable field contains nulls. Use NewRecordBatchChecked for an error.

Review Comment:
   While not strictly breaking, this might be a pretty painful change, if 
NewRecordBatch now "randomly" panics on things it would've accepted before; 
maybe we should deprecate this too as a stronger signal?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to