Eric Erhardt created ARROW-5887:
-----------------------------------

             Summary: [C#] ArrowStreamWriter writes FieldNodes in wrong order
                 Key: ARROW-5887
                 URL: https://issues.apache.org/jira/browse/ARROW-5887
             Project: Apache Arrow
          Issue Type: Bug
          Components: C#
            Reporter: Eric Erhardt
            Assignee: Eric Erhardt


When {{ArrowStreamWriter}} writes a {{RecordBatch}} that contains {{null}}s, it 
mixes up the columns' {{NullCount}} values.

You can see here:

[https://github.com/apache/arrow/blob/90affbd2c41e80aa8c3fac1e4dbff60aafb415d3/csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs#L195-L200]

There, it writes the fields in {{0}} -> {{fieldCount}} order. But 
[further down|https://github.com/apache/arrow/blob/90affbd2c41e80aa8c3fac1e4dbff60aafb415d3/csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs#L216-L220],
 it writes the fields in {{fieldCount}} -> {{0}} order.

The [Java 
implementation|https://github.com/apache/arrow/blob/7b2d68570b4336308c52081a0349675e488caf11/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/FBSerializables.java#L36-L44]
 has a comment about this ordering:
{quote}// struct vectors have to be created in reverse order
{quote}
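
The effect of that ordering can be sketched without Arrow at all. The snippet below is a minimal illustration, not the actual writer code: {{BackToFrontVector}} and {{FieldNodeOrderDemo}} are hypothetical names, and the only assumption is that the underlying builder fills its vector back to front, as the Java comment implies.

{code:java}
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a builder that fills its vector back to front:
// the element written last is the element read first.
class BackToFrontVector<T>
{
    private readonly LinkedList<T> _items = new LinkedList<T>();

    // Each write lands in front of everything written before it.
    public void Write(T item) => _items.AddFirst(item);

    public T[] Finish()
    {
        var result = new T[_items.Count];
        _items.CopyTo(result, 0);
        return result;
    }
}

static class FieldNodeOrderDemo
{
    // Null counts of the two columns in this report: "age" = 1, "CharCount" = 0.
    static readonly int[] NullCounts = { 1, 0 };

    // Writes entries in 0 -> fieldCount order, so they are read back reversed.
    public static int[] WriteForward()
    {
        var vector = new BackToFrontVector<int>();
        for (int i = 0; i < NullCounts.Length; i++)
        {
            vector.Write(NullCounts[i]);
        }
        return vector.Finish();
    }

    // Writes entries in fieldCount -> 0 order, so they are read back in column order.
    public static int[] WriteReversed()
    {
        var vector = new BackToFrontVector<int>();
        for (int i = NullCounts.Length - 1; i >= 0; i--)
        {
            vector.Write(NullCounts[i]);
        }
        return vector.Finish();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", WriteForward()));
        Console.WriteLine(string.Join(", ", WriteReversed()));
    }
}
{code}
With two columns, the forward loop yields null counts of 0 and 1 when read back, which matches the swap described in this issue; the reversed loop preserves 1 and 0.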
 

A simple test that round-trips the following {{RecordBatch}} shows the issue:

 
{code:java}
var result = new RecordBatch(
    new Schema.Builder()
        .Field(f => f.Name("age").DataType(Int32Type.Default))
        .Field(f => f.Name("CharCount").DataType(Int32Type.Default))
        .Build(),
    new IArrowArray[]
    {
        // "age": one slot whose validity bit is cleared, so the value is null.
        new Int32Array(
            new ArrowBuffer.Builder<int>().Append(0).Build(),
            new ArrowBuffer.Builder<byte>().Append(0).Build(),
            length: 1,
            nullCount: 1,
            offset: 0),
        // "CharCount": one non-null value (7), no validity bitmap needed.
        new Int32Array(
            new ArrowBuffer.Builder<int>().Append(7).Build(),
            ArrowBuffer.Empty,
            length: 1,
            nullCount: 0,
            offset: 0)
    },
    length: 1);
{code}
Here, the "age" column should contain a {{null}}. However, after writing this 
{{RecordBatch}} and reading it back, the "CharCount" column has {{NullCount}} 
== 1 and the "age" column has {{NullCount}} == 0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
