I think we thought about this approach but decided to get rid of the map
representation wherever we can while still supporting setting of the
options by name.

One of the lesser important downsides of keeping the map around is that we
will need to do `fromArgs` at least twice.

Another downside is that we will probably have to keep and maintain two
representations of the pipeline options at the same time and have extra
validations and probably reconciliation logic.

We need the map representation in the JDBC/command-line use case where it's
the only way for a user to specify the options. A user runs a special SQL
command which goes through normal parsing and execution logic. On top of
that we have a case of mixed Java/SQL pipelines, where we already have an
instance of PipelineOptions and don't need a user to set the options from
within a query. Right now this is impossible for other reasons as well. But
to support both JDBC and Java+SQL use cases we currently pass both a map
and a PipelineOptions instance around. Which makes things confusing. We can
probably reduce passing things around but I think we will still need to
keep both representations.

Ideally, I think, mixed Java+SQL pipelines should be backed by that same
JDBC logic as much as possible. So potentially we should allow users to set
the pipeline options from within a complicated query even in SqlTransform
in a Java pipeline. However setting an option from within SQL persists it
in the map, but in mixed case we already have the PipelineOption instance
that we got from the SqlTransform. So now we will need to maintain the
logic to reconcile the two representations. That will probably involve
either something similar to the proposed reflection approach, or
serializing both representations to a map or JSON and then reconciling and
then reconstructing it from there. This sounds unnecessary and we can avoid
this if we are able to just set the pipeline options by name in the first
place. In that case we can just use whatever PipelineOptions instance we
have at the moment without extra validation / reconciliation.

Hope this makes sense.

Regards,
Anton

On Thu, Jun 27, 2019 at 4:38 PM Lukasz Cwik <lc...@google.com> wrote:

> Not sure, based upon the JIRA description it seems like you want early
> validation of PipelineOptions. Couldn't you maintain the map of pipeline
> options and every time one is added call PipelineOptionsFactory.fromArgs
> discarding the result just for the error checking?
>
> On Tue, Jun 25, 2019 at 10:12 AM Alireza Samadian <asamad...@google.com>
> wrote:
>
>> Not sure. One solution might be moving the
>> PipelineOptionsReflectionSetter class to SQL package and make it package
>> private. This will prevent the exposure but the downside would be I need to
>> make PipelineOptionsFactory.parseObjects() public or duplicate its code. Do
>> you think this approach might be better? I would also appreciate if you
>> have another suggestion to solve this.
>>
>> Best,
>> Alireza
>>
>> On Tue, Jun 25, 2019 at 8:40 AM Lukasz Cwik <lc...@google.com> wrote:
>>
>>> That makes sense. I took a look at your PR, is there a way to do it
>>> without exposing the reflection capabilities to pipeline authors?
>>>
>>> On Mon, Jun 24, 2019 at 2:20 PM Alireza Samadian <asamad...@google.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am writing to ask if it is OK to slightly change the behaviour of SET
>>>> command in JDBC connection of Beam SQL. Currently, if we try to use set
>>>> command for an option that does not exist or setting an option to an
>>>> illegal value, it does not show any error until we run a query. This means
>>>> one can potentially set it incorrectly and then reset it correctly and run
>>>> query without getting any error. However, I want to make some changes in
>>>> JDBC Driver that causes this behavior to be changed. After this change, if
>>>> someone tries to use set command for a wrong pipeline option (in JDBC
>>>> path), it will immediately see an error message.
>>>>
>>>> The reason for this change is because I am working on the Jira issue
>>>> https://issues.apache.org/jira/projects/BEAM/issues/BEAM-7590, and I
>>>> am removing the Pipeline Option Map representation and keep the actual
>>>> pipeline options instead. As a result, each time that the set command is
>>>> called, it will try to change the pipeline options instance using
>>>> reflection instead of changing a map, and later constructing pipeline
>>>> options from it.
>>>>
>>>> The following is a link to the pull request:
>>>> https://github.com/apache/beam/pull/8928
>>>>
>>>> Best,
>>>> Alireza Samadian
>>>>
>>>

Reply via email to