[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418385072



##
File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs
##
@@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = 
default)
 
 var bytes = GetBytes(index);
 
+if (bytes == Span.Empty)
+{
+return null;
+}
+if (bytes.Length == 0)

Review comment:
   Yes. See 
`ArrayBuilderTests.StringArrayBuilderHandlesNullsAndEmptyStrings`. #Closed

##
File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs
##
@@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = 
default)
 
 var bytes = GetBytes(index);
 
+if (bytes == Span.Empty)

Review comment:
   I chose the first option as it avoids any public API changes. In general 
I am not a fan of `out` contracts, but did not have a better alternative. This 
way there is still time to contemplate alternatives before committing to the 
proposed overload. #Closed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418383314



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -162,8 +188,8 @@ public TBuilder Swap(int i, int j)
 public TArray Build(MemoryAllocator allocator = default)
 {
 return Build(
-ValueBuffer.Build(allocator), ArrowBuffer.Empty,
-ValueBuffer.Length, 0, 0);
+ValueBuffer.Build(allocator), 
ValidityBuffer.Build(allocator).ValueBuffer,

Review comment:
   As noted on another comment, I've pushed up this refactor. #Closed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418380212



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu
 {
 protected TBuilder Instance => this as TBuilder;
 protected ArrowBuffer.Builder ValueBuffer { get; }
+protected BooleanArray.Builder ValidityBuffer { get; }
 
 public int Length => ValueBuffer.Length;
-
-// TODO: Implement support for null values (null bitmaps)
+protected int NullCount { get; set; }
 
 internal PrimitiveArrayBuilder()
 {
 ValueBuffer = new ArrowBuffer.Builder();
+ValidityBuffer = new BooleanArray.Builder();
 }
 
 public TBuilder Resize(int length)
 {
 ValueBuffer.Resize(length);
+ValidityBuffer.Resize(length + 1);

Review comment:
   Copy/paste error from `BinaryArray`. Good catch.
   
   Fixed. #Closed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418424144



##
File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs
##
@@ -135,6 +152,11 @@ public int GetValueOffset(int index)
 
 public int GetValueLength(int index)
 {
+if (IsNull(index))

Review comment:
   Fixed #Closed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r416049921



##
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##
@@ -84,7 +84,7 @@ public ArrayData Slice(int offset, int length)
 length = Math.Min(Length - offset, length);
 offset += Offset;
 
-return new ArrayData(DataType, length, -1, offset, Buffers, 
Children);
+return new ArrayData(DataType, length, NullCount, offset, Buffers, 
Children);

Review comment:
   Appears obvious in retrospect. I reverted, though I did introduce a 
constant for enhanced readability. #Closed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418423816



##
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##
@@ -22,6 +22,8 @@ namespace Apache.Arrow
 {
 public sealed class ArrayData : IDisposable
 {
+public static readonly int RecalculateNullCount = -1;

Review comment:
   Changed #Closed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418425841



##
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##
@@ -53,6 +60,26 @@ public sealed class ArrayData : IDisposable
 Offset = offset;
 Buffers = buffers;
 Children = children;
+
+if (NullCount == RecalculateNullCount)
+{
+Recalculate();
+}
+}
+
+private int Recalculate()

Review comment:
   That's been pushed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418424732



##
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##
@@ -53,6 +60,26 @@ public sealed class ArrayData : IDisposable
 Offset = offset;
 Buffers = buffers;
 Children = children;
+
+if (NullCount == RecalculateNullCount)
+{
+Recalculate();
+}
+}
+
+private int Recalculate()

Review comment:
   I've fixed in a branch but will revert in the PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418424144



##
File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs
##
@@ -135,6 +152,11 @@ public int GetValueOffset(int index)
 
 public int GetValueLength(int index)
 {
+if (IsNull(index))

Review comment:
   Fixed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418423816



##
File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs
##
@@ -22,6 +22,8 @@ namespace Apache.Arrow
 {
 public sealed class ArrayData : IDisposable
 {
+public static readonly int RecalculateNullCount = -1;

Review comment:
   Changed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418423261



##
File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs
##
@@ -69,25 +83,28 @@ public ListArray Build(MemoryAllocator allocator = default)
 
 return new ListArray(DataType, Length - 1,
 ValueOffsetsBufferBuilder.Build(allocator), 
ValueBuilder.Build(allocator),
-new ArrowBuffer(), 0, 0);
+ValidityBufferBuilder.Build(allocator).ValueBuffer, 
NullCount, 0);

Review comment:
   Fixed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418422210



##
File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs
##
@@ -153,17 +184,25 @@ private void CheckIndex(int index)
 new[] { nullBitmapBuffer, valueBuffer }))
 { }
 
-public BooleanArray(ArrayData data) 
+public BooleanArray(ArrayData data)
 : base(data)
 {
 data.EnsureDataType(ArrowTypeId.Boolean);
 }
 
 public override void Accept(IArrowArrayVisitor visitor) => 
Accept(this, visitor);
 
+[Obsolete("GetBoolean does not support null values. Use GetValue 
instead (which this method invokes internally).")]
 public bool GetBoolean(int index)
 {
-return BitUtility.GetBit(Values, index);
+return GetValue(index).GetValueOrDefault();

Review comment:
   Changed in the test code uses of `GetValueOrDefault` to `Value`, but 
think this one should remain `GetValueOrDefault`. Existing code is likely to 
break otherwise. I've seen it already with the tests. Current code written 
against `GetBoolean` will not be expecting an exception, and it's easy enough 
for a null to get introduced somewhere else that eventually causes `GetBoolean` 
throw.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418419358



##
File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs
##
@@ -153,17 +184,25 @@ private void CheckIndex(int index)
 new[] { nullBitmapBuffer, valueBuffer }))
 { }
 
-public BooleanArray(ArrayData data) 
+public BooleanArray(ArrayData data)
 : base(data)
 {
 data.EnsureDataType(ArrowTypeId.Boolean);
 }
 
 public override void Accept(IArrowArrayVisitor visitor) => 
Accept(this, visitor);
 
+[Obsolete("GetBoolean does not support null values. Use GetValue 
instead (which this method invokes internally).")]
 public bool GetBoolean(int index)
 {
-return BitUtility.GetBit(Values, index);
+return GetValue(index).GetValueOrDefault();

Review comment:
   Agree. Changed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418417744



##
File path: csharp/test/Apache.Arrow.Tests/BooleanArrayTests.cs
##
@@ -48,13 +48,13 @@ public void AppendsExpectedBit()
 .Append(false)
 .Build();
 
-Assert.False(array1.GetBoolean(0));
+Assert.False(array1.GetValue(0).GetValueOrDefault());

Review comment:
   With the new builder behavior of setting  NullCount to 0 and passing an 
empty buffer when null isn’t present, the Assert.Null calls in these tests fail 
because the IsValid check always returns true. These particular tests are not 
appending null values.
   
   I opted to move them to the new GetValue method as GetBoolean is deprecated.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418388163



##
File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs
##
@@ -153,17 +182,25 @@ private void CheckIndex(int index)
 new[] { nullBitmapBuffer, valueBuffer }))
 { }
 
-public BooleanArray(ArrayData data) 
+public BooleanArray(ArrayData data)
 : base(data)
 {
 data.EnsureDataType(ArrowTypeId.Boolean);
 }
 
 public override void Accept(IArrowArrayVisitor visitor) => 
Accept(this, visitor);
 
+[Obsolete("GetBoolean does not support null values. Use GetValue 
instead (which this method invokes internally).")]
 public bool GetBoolean(int index)
 {
-return BitUtility.GetBit(Values, index);
+return GetValue(index).GetValueOrDefault();
+}
+
+public bool? GetValue(int index)
+{
+return BitUtility.GetBit(Nulls, Offset + index)

Review comment:
   There is a unit test, and it was passing; however, the line of code 
above is too simplistic and only handled the first byte/offsets [0..7]. The 
proper line of code would be:
   
   ```csharp
   return BitUtility.GetBit(Values, index + (Offset - (int)(Math.Floor(Offset / 
8.0) * 8.0) ));
   ```
   
   The addition of `Nulls` was an attempt to provide a convenience property 
similar to `Values`, but by also properly incorporating `Offset` in the 
`Span.Slice` range as it was previously ignored. I wanted to make have `Nulls` 
work this way in order to keep this more complicated code in one place.
   
   However, using `Nulls` and `Values` when working with a sliced `StringArray` 
makes life trickier for the caller, and perhaps resulted in minutely worse 
performance in common usage patterns, so I just removed `Nulls` and reverted 
the change to `Values` and now use the `NullBitmapBuffer.Span` directly.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418385072



##
File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs
##
@@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = 
default)
 
 var bytes = GetBytes(index);
 
+if (bytes == Span.Empty)
+{
+return null;
+}
+if (bytes.Length == 0)

Review comment:
   Yes. See 
`ArrayBuilderTests.StringArrayBuilderHandlesNullsAndEmptyStrings`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418384847



##
File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs
##
@@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = 
default)
 
 var bytes = GetBytes(index);
 
+if (bytes == Span.Empty)

Review comment:
   I chose the first option as it avoids any public API changes. In general 
I am not a fan of `out` contracts, but did not have a better alternative. This 
way there is still time to contemplate alternatives before committing to the 
proposed overload.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418383314



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -162,8 +188,8 @@ public TBuilder Swap(int i, int j)
 public TArray Build(MemoryAllocator allocator = default)
 {
 return Build(
-ValueBuffer.Build(allocator), ArrowBuffer.Empty,
-ValueBuffer.Length, 0, 0);
+ValueBuffer.Build(allocator), 
ValidityBuffer.Build(allocator).ValueBuffer,

Review comment:
   As noted on another comment, I've pushed up this refactor.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418382595



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu
 {
 protected TBuilder Instance => this as TBuilder;
 protected ArrowBuffer.Builder ValueBuffer { get; }
+protected BooleanArray.Builder ValidityBuffer { get; }
 
 public int Length => ValueBuffer.Length;
-
-// TODO: Implement support for null values (null bitmaps)
+protected int NullCount { get; set; }
 
 internal PrimitiveArrayBuilder()
 {
 ValueBuffer = new ArrowBuffer.Builder();
+ValidityBuffer = new BooleanArray.Builder();
 }
 
 public TBuilder Resize(int length)
 {
 ValueBuffer.Resize(length);
+ValidityBuffer.Resize(length + 1);
 return Instance;
 }
 
 public TBuilder Reserve(int capacity)
 {
 ValueBuffer.Reserve(capacity);
+ValidityBuffer.Reserve(capacity + 1);
 return Instance;
 }
 
 public TBuilder Append(T value)
 {
 ValueBuffer.Append(value);
+ValidityBuffer.Append(true);
 return Instance;
 }
 
 public TBuilder Append(ReadOnlySpan span)
 {
+var len = ValueBuffer.Length;
 ValueBuffer.Append(span);
+ValidityBuffer.AppendRange(Enumerable.Repeat(true, 
ValueBuffer.Length - len));
 return Instance;
 }
 
 public TBuilder AppendRange(IEnumerable values)
 {
+var len = ValueBuffer.Length;
 ValueBuffer.AppendRange(values);
+ValidityBuffer.AppendRange(Enumerable.Repeat(true, 
ValueBuffer.Length - len));
+return Instance;
+}
+
+public TBuilder AppendNull()
+{
+ValidityBuffer.Append(false);
+NullCount++;
+// Need this until this is refactored to use

Review comment:
   `PrimitiveArrayBuilder` sidesteps the use of an offset buffer. In doing 
so, we end up having to take-up space in the `ValueBuffer`, even when we have 
`null` values, in order to keep `Length` correct, since:
   
   ```csharp
   public int Length => ValueBuffer.Length;
   ```
   This comment was perhaps an overly-terse way to note this. My assumption was 
that this builder would eventually be refactored to use an offset buffer as 
seen in `BuilderBase`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418382595



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu
 {
 protected TBuilder Instance => this as TBuilder;
 protected ArrowBuffer.Builder ValueBuffer { get; }
+protected BooleanArray.Builder ValidityBuffer { get; }
 
 public int Length => ValueBuffer.Length;
-
-// TODO: Implement support for null values (null bitmaps)
+protected int NullCount { get; set; }
 
 internal PrimitiveArrayBuilder()
 {
 ValueBuffer = new ArrowBuffer.Builder();
+ValidityBuffer = new BooleanArray.Builder();
 }
 
 public TBuilder Resize(int length)
 {
 ValueBuffer.Resize(length);
+ValidityBuffer.Resize(length + 1);
 return Instance;
 }
 
 public TBuilder Reserve(int capacity)
 {
 ValueBuffer.Reserve(capacity);
+ValidityBuffer.Reserve(capacity + 1);
 return Instance;
 }
 
 public TBuilder Append(T value)
 {
 ValueBuffer.Append(value);
+ValidityBuffer.Append(true);
 return Instance;
 }
 
 public TBuilder Append(ReadOnlySpan span)
 {
+var len = ValueBuffer.Length;
 ValueBuffer.Append(span);
+ValidityBuffer.AppendRange(Enumerable.Repeat(true, 
ValueBuffer.Length - len));
 return Instance;
 }
 
 public TBuilder AppendRange(IEnumerable values)
 {
+var len = ValueBuffer.Length;
 ValueBuffer.AppendRange(values);
+ValidityBuffer.AppendRange(Enumerable.Repeat(true, 
ValueBuffer.Length - len));
+return Instance;
+}
+
+public TBuilder AppendNull()
+{
+ValidityBuffer.Append(false);
+NullCount++;
+// Need this until this is refactored to use

Review comment:
   `PrimitiveArrayBuilder` sidesteps the offset buffer entirely. In doing 
so, we end up having to take-up space in the `ValueBuffer`, even when we have 
`null` values, in order to keep `Length` correct, since:
   
   ```csharp
   public int Length => ValueBuffer.Length;
   ```
   This comment was perhaps an overly-terse was to note this. My assumption was 
that this builder would eventually be refactored to use an offset buffer as 
seen in `BuilderBase`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418382595



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu
 {
 protected TBuilder Instance => this as TBuilder;
 protected ArrowBuffer.Builder ValueBuffer { get; }
+protected BooleanArray.Builder ValidityBuffer { get; }
 
 public int Length => ValueBuffer.Length;
-
-// TODO: Implement support for null values (null bitmaps)
+protected int NullCount { get; set; }
 
 internal PrimitiveArrayBuilder()
 {
 ValueBuffer = new ArrowBuffer.Builder();
+ValidityBuffer = new BooleanArray.Builder();
 }
 
 public TBuilder Resize(int length)
 {
 ValueBuffer.Resize(length);
+ValidityBuffer.Resize(length + 1);
 return Instance;
 }
 
 public TBuilder Reserve(int capacity)
 {
 ValueBuffer.Reserve(capacity);
+ValidityBuffer.Reserve(capacity + 1);
 return Instance;
 }
 
 public TBuilder Append(T value)
 {
 ValueBuffer.Append(value);
+ValidityBuffer.Append(true);
 return Instance;
 }
 
 public TBuilder Append(ReadOnlySpan span)
 {
+var len = ValueBuffer.Length;
 ValueBuffer.Append(span);
+ValidityBuffer.AppendRange(Enumerable.Repeat(true, 
ValueBuffer.Length - len));
 return Instance;
 }
 
 public TBuilder AppendRange(IEnumerable values)
 {
+var len = ValueBuffer.Length;
 ValueBuffer.AppendRange(values);
+ValidityBuffer.AppendRange(Enumerable.Repeat(true, 
ValueBuffer.Length - len));
+return Instance;
+}
+
+public TBuilder AppendNull()
+{
+ValidityBuffer.Append(false);
+NullCount++;
+// Need this until this is refactored to use

Review comment:
   `PrimitiveArrayBuilder` sidesteps the use of an offset buffer. In doing 
so, we end up having to take-up space in the `ValueBuffer`, even when we have 
`null` values, in order to keep `Length` correct, since:
   
   ```csharp
   public int Length => ValueBuffer.Length;
   ```
   This comment was perhaps an overly-terse was to note this. My assumption was 
that this builder would eventually be refactored to use an offset buffer as 
seen in `BuilderBase`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418380212



##
File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs
##
@@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu
 {
 protected TBuilder Instance => this as TBuilder;
 protected ArrowBuffer.Builder ValueBuffer { get; }
+protected BooleanArray.Builder ValidityBuffer { get; }
 
 public int Length => ValueBuffer.Length;
-
-// TODO: Implement support for null values (null bitmaps)
+protected int NullCount { get; set; }
 
 internal PrimitiveArrayBuilder()
 {
 ValueBuffer = new ArrowBuffer.Builder();
+ValidityBuffer = new BooleanArray.Builder();
 }
 
 public TBuilder Resize(int length)
 {
 ValueBuffer.Resize(length);
+ValidityBuffer.Resize(length + 1);

Review comment:
   Copy/paste error from `BinaryArray`. Good catch.
   
   Fixed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox


zgramana commented on a change in pull request #7032:
URL: https://github.com/apache/arrow/pull/7032#discussion_r418162946



##
File path: csharp/src/Apache.Arrow/Arrays/BinaryArray.cs
##
@@ -111,23 +130,32 @@ public TBuilder AppendRange(IEnumerable values)
 public TBuilder AppendRange(IEnumerable values)
 {
 var len = ValueBuffer.Length;
-ValueOffsets.Append(Offset);
 ValueBuffer.AppendRange(values);
-Offset += ValueBuffer.Length - len;
+var valOffset = ValueBuffer.Length - len;
+if (valOffset == 0)

Review comment:
   Fixed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org