Doug Rohrer created CASSANALYTICS-32:
----------------------------------------
Summary: Support writing to tables with constraints
Key: CASSANALYTICS-32
URL: https://issues.apache.org/jira/browse/CASSANALYTICS-32
Project: Apache Cassandra Analytics
Issue Type: New Feature
Components: Writer
Reporter: Doug Rohrer
Today, the underlying CQLSSTableWriter throws an exception if we write data
that violates a constraint. If left in this state, the Spark job would fail,
because the task with the invalid data would not be able to complete.
We need to decide how to handle these cases and implement the appropriate
logic in the bulk writer. This could be a writer option that lets the end user
choose which behavior to apply, and could include:
# Fail the job (no code changes in the bulk writer).
# Skip rows that violate constraints and log them. This would mostly mean
adding a try/catch around the call to addRow and a log line recording the
invalid data. From experience, most folks don't check the logs of successful
jobs, so this may eventually lead users to think they "lost" data when it
was really just not writable. We should absolutely not do this by default;
it should be an opt-in feature, with failing the job as the default.
# Add a feature to the CQLSSTableWriter to disable constraints, so that data
which would otherwise not meet the table's constraints can be written
without failing the Spark job. This should probably be logged as well.
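The first two options could be sketched roughly as follows. This is only an illustration under stated assumptions, not the bulk writer's actual code: `OnViolation`, `ConstraintViolation`, `RowSink` and `writeAll` are hypothetical names, and `RowSink` stands in for the call to addRow on the underlying writer. The third option is not shown because it depends on a CQLSSTableWriter feature that does not exist yet.

```java
import java.util.ArrayList;
import java.util.List;

final class ConstraintHandlingSketch {
    /** Hypothetical writer option; FAIL_JOB would be the default. */
    enum OnViolation { FAIL_JOB, SKIP_AND_LOG }

    /** Hypothetical stand-in for the exception thrown on a constraint violation. */
    static final class ConstraintViolation extends RuntimeException {
        ConstraintViolation(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the call to addRow on the underlying writer. */
    interface RowSink {
        void addRow(List<Object> row);
    }

    /**
     * Writes every row to the sink and returns the rows that were skipped.
     * In FAIL_JOB mode the first violation is rethrown, which fails the task.
     */
    static List<List<Object>> writeAll(RowSink sink, List<List<Object>> rows, OnViolation mode) {
        List<List<Object>> skipped = new ArrayList<>();
        for (List<Object> row : rows) {
            try {
                sink.addRow(row);
            } catch (ConstraintViolation e) {
                if (mode == OnViolation.FAIL_JOB) {
                    throw e; // current behavior: the task, and so the job, fails
                }
                // Opt-in behavior: record the invalid row and keep going.
                System.err.println("Skipping row that violates a constraint: "
                        + row + " (" + e.getMessage() + ")");
                skipped.add(row);
            }
        }
        return skipped;
    }
}
```

Returning the skipped rows, rather than only logging them, would let the job report a count of rejected rows at the end, which may help with the concern that users rarely read the logs of successful jobs.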