Mike Pedersen created BEAM-6607:
-----------------------------------

             Summary: SchemaCoder cannot encode row with null value in array
                 Key: BEAM-6607
                 URL: https://issues.apache.org/jira/browse/BEAM-6607
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core
            Reporter: Mike Pedersen
            Assignee: Kenneth Knowles


This example fails with a "cannot encode null Integer" exception:
{code:java}
import com.google.common.io.ByteStreams;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.schemas.SchemaCoder;
import org.apache.beam.sdk.values.Row;
import java.io.IOException;
import java.util.Collections;

public class Main {
    public static void main(String[] args) throws IOException {
        Schema schema = Schema.builder()
                .addField("a", Schema.FieldType.array(Schema.FieldType.INT32, true))
                .build();

        Row row = Row.withSchema(schema).addValue(Collections.singletonList(null)).build();

        SchemaCoder.of(schema).encode(row, ByteStreams.nullOutputStream());
    }
}{code}
Note that null in the array should be OK, as the nullable parameter to 
Schema.FieldType.array is true.

An easy way of solving this could be to wrap inner coders with a NullableCoder, 
but a better way would probably be to have something like a NullableIterableCoder 
that uses a bitset, similar to how the SchemaCoder encodes nullable fields.
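
To make the bitset idea concrete, here is a rough, JDK-only sketch of what such an encoding could look like for a list of nullable integers. The class name NullableIntListCodec and the wire layout (element count, null-mask length, null mask, then only the non-null values) are assumptions made for illustration. This is not Beam's coder API and not necessarily how SchemaCoder lays out its own bitset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of Beam: encodes a List<Integer> that may
// contain nulls by writing a null bitmask ahead of the non-null values.
public class NullableIntListCodec {

    public static void encode(List<Integer> values, DataOutputStream out) throws IOException {
        // Bit i is set when element i is null.
        BitSet nulls = new BitSet(values.size());
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) == null) {
                nulls.set(i);
            }
        }
        byte[] mask = nulls.toByteArray(); // empty when there are no nulls

        out.writeInt(values.size());
        out.writeInt(mask.length);
        out.write(mask);
        // Only non-null elements are written; nulls cost one bit each.
        for (Integer value : values) {
            if (value != null) {
                out.writeInt(value);
            }
        }
    }

    public static List<Integer> decode(DataInputStream in) throws IOException {
        int size = in.readInt();
        byte[] mask = new byte[in.readInt()];
        in.readFully(mask);
        BitSet nulls = BitSet.valueOf(mask);

        List<Integer> values = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            if (nulls.get(i)) {
                values.add(null);
            } else {
                values.add(in.readInt());
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        List<Integer> input = Arrays.asList(1, null, 3);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        encode(input, new DataOutputStream(buffer));

        List<Integer> roundTripped =
                decode(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(roundTripped.equals(input));
    }
}
```

Compared with wrapping every element in a NullableCoder, which as far as I know spends a full marker byte per element, this layout spends one bit per element plus a small fixed header.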

I'll probably take a stab at fixing this and opening a pull request.


