I think we already have this table capability: ACCEPT_ANY_SCHEMA. Can you try that?
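Roughly, the table would advertise it like below (just a sketch; the class and schema names are made up, and the exact newWriteBuilder signature depends on your Spark version). As far as I can tell, with that capability the analyzer skips the output-column resolution, so the source itself has to reject writes that are missing mandatory columns (e.g. the C* primary key):

import java.util
import org.apache.spark.sql.connector.catalog.{SupportsWrite, Table, TableCapability}
import org.apache.spark.sql.connector.write.{LogicalWriteInfo, WriteBuilder}
import org.apache.spark.sql.types.StructType

// Illustrative only: a DSv2 table that opts out of strict output-column
// resolution by declaring ACCEPT_ANY_SCHEMA.
class PartialWriteTable(fullSchema: StructType) extends Table with SupportsWrite {
  override def name(): String = "partial_write_table"

  override def schema(): StructType = fullSchema

  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_WRITE, TableCapability.ACCEPT_ANY_SCHEMA)

  override def newWriteBuilder(info: LogicalWriteInfo): WriteBuilder = {
    // info.schema() is the query's output schema as-is; validate here that
    // all required columns are present and fail the write otherwise.
    ???
  }
}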
On Thu, May 14, 2020 at 6:17 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> I would really appreciate that. I'm probably going to just write a planner
> rule for now which matches my table schema against the query output if they
> are compatible, and fails analysis otherwise. This approach is how I got
> metadata columns in, so I believe it would work for writing as well.
>
> On Wed, May 13, 2020 at 5:13 PM Ryan Blue <rb...@netflix.com> wrote:
>
>> I agree with adding a table capability for this. This is something that
>> we support in our Spark branch so that users can evolve tables without
>> breaking existing ETL jobs -- when you add an optional column, it shouldn't
>> fail the existing pipeline writing data to a table. I can contribute the
>> changes to validation if people are interested.
>>
>> On Wed, May 13, 2020 at 2:57 PM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> In DSv1 this was pretty easy to do because the burden of verification
>>> for writes was on the datasource; the new setup makes partial writes
>>> difficult.
>>>
>>> resolveOutputColumns checks the table schema against the write plan's
>>> output and fails any request that doesn't contain every column
>>> specified in the table schema.
>>> I would like this check to be optional for a datasource, perhaps via an
>>> "allow partial writes" trait for the table. Or, alternatively, allow
>>> analysis to fail in "withInputDataSchema", where an implementer could
>>> throw exceptions on underspecified writes.
>>>
>>> The use case here is that C* (and many other sinks) have mandated
>>> columns that must be present during an insert as well as columns
>>> which are not required.
>>>
>>> Please let me know if I've misread this.
>>>
>>> Thanks for your time again,
>>> Russ
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>