I think we already have this table capability: ACCEPT_ANY_SCHEMA. Can you try that?
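Roughly, the table would advertise it like below (just a sketch; the class and schema names are made up, and the exact newWriteBuilder signature depends on your Spark version). As far as I can tell, with that capability the analyzer skips the output-column resolution, so the source itself has to reject writes that are missing mandatory columns (e.g. the C* primary key):

import java.util
import org.apache.spark.sql.connector.catalog.{SupportsWrite, Table, TableCapability}
import org.apache.spark.sql.connector.write.{LogicalWriteInfo, WriteBuilder}
import org.apache.spark.sql.types.StructType

// Illustrative only: a DSv2 table that opts out of strict output-column
// resolution by declaring ACCEPT_ANY_SCHEMA.
class PartialWriteTable(fullSchema: StructType) extends Table with SupportsWrite {
  override def name(): String = "partial_write_table"

  override def schema(): StructType = fullSchema

  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_WRITE, TableCapability.ACCEPT_ANY_SCHEMA)

  override def newWriteBuilder(info: LogicalWriteInfo): WriteBuilder = {
    // info.schema() is the query's output schema as-is; validate here that
    // all required columns are present and fail the write otherwise.
    ???
  }
}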
On Thu, May 14, 2020 at 6:17 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> I would really appreciate that. I'm probably going to just write a planner
> rule for now which matches my table schema against the query output if they
> are compatible, and fails analysis otherwise. This approach is how I got
> metadata columns in, so I believe it would work for writing as well.
>
> On Wed, May 13, 2020 at 5:13 PM Ryan Blue <rb...@netflix.com> wrote:
>
>> I agree with adding a table capability for this. This is something that
>> we support in our Spark branch so that users can evolve tables without
>> breaking existing ETL jobs -- when you add an optional column, it shouldn't
>> fail the existing pipeline writing data to a table. I can contribute the
>> changes to validation if people are interested.
>>
>> On Wed, May 13, 2020 at 2:57 PM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> In DSv1 this was pretty easy to do because the burden of verification
>>> for writes was on the datasource; the new setup makes partial writes
>>> difficult.
>>>
>>> resolveOutputColumns checks the table schema against the write plan's
>>> output and fails any request that doesn't contain every column
>>> specified in the table schema.
>>> I would like this check to be optional for a datasource, perhaps via an
>>> "allow partial writes" trait for the table. Or, alternatively, allow
>>> analysis to fail in "withInputDataSchema", where an implementer could
>>> throw exceptions on underspecified writes.
>>>
>>> The use case here is that C* (and many other sinks) have mandated
>>> columns that must be present during an insert as well as columns
>>> which are not required.
>>>
>>> Please let me know if I've misread this.
>>>
>>> Thanks for your time again,
>>> Russ
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>