[
https://issues.apache.org/jira/browse/FLINK-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865456#comment-16865456
]
Ozan Cicekci commented on FLINK-12820:
--------------------------------------
[~yunta] thanks for the comments! Regarding your questions,
Yes, not just Scala tuples or case classes; Flink's built-in {{Tuple}} or {{Row}}
types also have this problem. Basically, you can run into this issue with any
data type whose sink extends
[AbstractCassandraTupleSink|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/streaming/connectors/cassandra/AbstractCassandraTupleSink.java].
As far as I'm aware, only the POJO sink lets you pass configuration to
ignore writing nulls, so the issue is fairly easy to avoid in Java by
working with POJOs. When describing the issue I meant to emphasize the Scala
part rather than the data type: since POJOs are common in Java, the problem is
easy to work around there, but much harder to work around in Scala.
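For reference, the POJO-sink workaround looks roughly like this (a sketch based on the documented {{setMapperOptions}} hook on the sink builder; the stream and host are placeholders):

```java
// Sketch: POJO sink configured so the DataStax mapper skips null fields,
// assuming the documented setMapperOptions(...) hook on CassandraSink's builder.
CassandraSink.addSink(pojoStream)            // pojoStream: DataStream<MyPojo>
        .setHost("127.0.0.1")                // placeholder host
        .setMapperOptions(() -> new Mapper.Option[] {
                // saveNullFields(false): null POJO fields are left unset
                // instead of being written as tombstones
                Mapper.Option.saveNullFields(false)
        })
        .build();
```

Nothing comparable is exposed for the tuple/case-class path, which is what this ticket is about.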
I only added tests for one data type since the source of the problem is in
AbstractCassandraTupleSink, but I can add more tests for other data types if
you think that would be better.
Also, sorry for the unclear abbreviations! C* is short for Cassandra.
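To make the proposed behavior concrete, here is a minimal, driver-free sketch of the binding logic such a flag could enable in AbstractCassandraTupleSink: null positions are left unset (mirroring the driver's {{BoundStatement#unset(int)}}) instead of being bound as null. The {{planBind}} helper and its string output are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class IgnoreNullsSketch {
    // Hypothetical helper mirroring what a saveNullFields-style flag could
    // do in AbstractCassandraTupleSink: bind only non-null fields and leave
    // null positions unset, so Cassandra writes no tombstones for them.
    static List<String> planBind(Object[] fields, boolean ignoreNulls) {
        List<String> ops = new ArrayList<>();
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == null && ignoreNulls) {
                // the real driver call would be BoundStatement#unset(i)
                ops.add("unset(" + i + ")");
            } else {
                ops.add("set(" + i + ", " + fields[i] + ")");
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        // Tuple3-like record with a missing middle field.
        System.out.println(planBind(new Object[] {"id-1", null, 42}, true));
        // -> [set(0, id-1), unset(1), set(2, 42)]
    }
}
```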
> Support ignoring null fields when writing to Cassandra
> ------------------------------------------------------
>
> Key: FLINK-12820
> URL: https://issues.apache.org/jira/browse/FLINK-12820
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Cassandra
> Affects Versions: 1.8.0
> Reporter: Ozan Cicekci
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, records that have null fields are written to their corresponding
> columns in Cassandra as null. For Cassandra, writing null is effectively a
> 'delete' (it produces a tombstone). That is useful if nulls should correspond
> to deletes in the data model, but a null can also indicate missing data or a
> partial column update. In that case, we end up overwriting columns of the
> existing record in Cassandra with nulls.
>
> I believe it's already possible to ignore null values for POJOs with mapper
> options, as documented here:
> [https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/cassandra.html#cassandra-sink-example-for-streaming-pojo-data-type]
>
> But this is not possible when using Scala tuples or case classes. Perhaps,
> behind a Cassandra sink configuration flag, null values could be left unset
> for tuples and case classes using the option below:
> [https://docs.datastax.com/en/drivers/java/3.0/com/datastax/driver/core/BoundStatement.html#unset-int-]
>
> Here is the equivalent configuration in spark-cassandra-connector:
> [https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md#globally-treating-all-nulls-as-unset]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)