Hi all,

Is the behaviour of pa.Field.nullable documented somewhere?

I had some expectations of what it does. For example, I expected it to
ensure that a column declared with nullable=False cannot contain a
null/missing value. But that doesn't seem to be the case.

```
import pyarrow as pa

schema = pa.schema(
    [
        pa.field("nullable_true", pa.string(), nullable=True),
        pa.field("nullable_false", pa.string(), nullable=False),
    ]
)

table = pa.Table.from_arrays(
    [
        pa.array(["", "foo", None], pa.string()),
        pa.array(["", "foo", None], pa.string()),
    ],
    schema=schema,
)

# All of these pass, even though "nullable_false" contains a null.
assert table.schema == schema
assert table['nullable_true'].null_count == 1
assert table['nullable_false'].null_count == 1
assert table.validate() is None
assert table.validate(full=True) is None
```

The only place where I've seen the nullable flag being used is when casting
a nested column from nullable to non-nullable:

```
import pyarrow as pa

struct_array = pa.StructArray.from_arrays(
    [
        pa.array(["", "foo", None], pa.string()),
    ],
    names=["nested_col_level_1"],
)
nested_table = pa.Table.from_arrays(
    [struct_array], names=["nested_col_level_0"]
)
assert nested_table.validate(full=True) is None
assert nested_table.validate() is None

nested_table.cast(
    pa.schema(
        [
            pa.field(
                "nested_col_level_0",
                pa.struct(
                    [pa.field("nested_col_level_1", pa.string(), nullable=False)]
                ),
            )
        ]
    )
)
```

Thanks for your help!
