Doug Rohrer created CASSANALYTICS-32:
----------------------------------------

             Summary: Support writing to tables with constraints
                 Key: CASSANALYTICS-32
                 URL: https://issues.apache.org/jira/browse/CASSANALYTICS-32
             Project: Apache Cassandra Analytics
          Issue Type: New Feature
          Components: Writer
            Reporter: Doug Rohrer


Today, the underlying CQLSSTableWriter throws an exception when we write data 
that violates a constraint. If left as is, a single invalid row would fail the 
whole Spark job, because the task holding that row wouldn't be able to complete.

Decide how we're going to handle these cases and implement the appropriate 
logic in the bulk writer. This could be a writer option that lets the end user 
choose the behavior. Candidate behaviors include:
 # Fail job (no code changes on the Bulk Writer)
 # Skip rows that violate constraints and log them. This would mostly mean 
adding a try/catch around the call to addRow and a log line recording the 
invalid data. From experience, most folks don't check logs for successful 
jobs, so this may eventually lead users to think they "lost" data when it was 
really just not writable. We should absolutely not do this by default; it 
should be an opt-in feature, with failing the job as the default.
 # Add a feature to the CQLSSTableWriter to disable constraint checks. This 
would let data that does not meet the table's constraints be written without 
failing the Spark job. Such writes should probably be logged as well.
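
To make the first two options concrete, here is a minimal sketch of what an 
opt-in policy around the addRow call might look like. Everything named here is 
hypothetical and exists only for illustration: ConstraintViolationPolicy, 
ConstraintViolationException, RowSink and ConstraintAwareWriter are not part of 
the Bulk Writer or of CQLSSTableWriter. RowSink stands in for the real writer, 
and the exception type actually thrown on a constraint violation would need to 
be confirmed against Cassandra.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical writer option; FAIL_JOB is the default, skipping is opt-in. */
enum ConstraintViolationPolicy { FAIL_JOB, SKIP_AND_LOG }

/** Stand-in (assumption) for the exception thrown on a constraint violation. */
class ConstraintViolationException extends RuntimeException {
    ConstraintViolationException(String message) { super(message); }
}

/** Stand-in (assumption) for the addRow call on the underlying SSTable writer. */
interface RowSink {
    void addRow(List<?> row);
}

class ConstraintAwareWriter {
    private final RowSink sink;
    private final ConstraintViolationPolicy policy;
    private final Consumer<String> log;
    private long skippedRows = 0;

    ConstraintAwareWriter(RowSink sink, ConstraintViolationPolicy policy, Consumer<String> log) {
        this.sink = sink;
        this.policy = policy;
        this.log = log;
    }

    /** Writes one row, applying the configured policy to constraint violations. */
    void write(List<?> row) {
        try {
            sink.addRow(row);
        } catch (ConstraintViolationException e) {
            if (policy == ConstraintViolationPolicy.FAIL_JOB) {
                throw e; // option 1: propagate so the task, and thus the job, fails
            }
            // option 2: drop the row, but count it and log it so it can be found later
            skippedRows++;
            log.accept("Skipping row that violates a constraint: " + row
                    + " (" + e.getMessage() + ")");
        }
    }

    /** Number of rows dropped so far; could be surfaced as a job metric. */
    long skippedRows() { return skippedRows; }
}
```

Counting the skipped rows, and not only logging them, is one way to address the 
concern that logs of successful jobs go unread: the count could be reported in 
the job result where users are more likely to see it.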



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
