Hi all,

indeed this is something that can be a little frustrating, and I agree with Philipp.

For exchanging intermediary processors: a "replace" feature could be beneficial here. It could be added to the mouseover options of each pipeline element and would simply perform a compatibility check. It would then show "hints", similar to the global pipeline hints (the red triangle with "!"), but at the level of the affected downstream pipeline elements. That way it is obvious to users which downstream pipeline element configurations they need to revisit wherever there is a misalignment. I'd argue that reconfiguring these processors is far more convenient than disconnecting, deleting, selecting, connecting, and configuring the downstream elements all over again, which is the current approach: the user has to rebuild the complete graph.
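To make the idea more concrete, here is a rough sketch of such a check in TypeScript. The types (EventProperty, EventSchema, PipelineElement) are simplified stand-ins, not the actual StreamPipes model: on replace, walk the downstream elements and collect a hint for every element whose required input fields no longer appear in what its upstream element produces.

// Simplified stand-in types, not the actual StreamPipes model.
interface EventProperty { runtimeName: string; runtimeType: string; }
interface EventSchema { properties: EventProperty[]; }

interface PipelineElement {
  id: string;
  name: string;
  inputSchema: EventSchema;    // what this element expects
  outputSchema: EventSchema;   // what this element produces
  successors: PipelineElement[];
}

interface Hint { elementId: string; message: string; }

// Walk downstream from the replacement candidate and collect a hint for
// every element whose expected input no longer matches what its upstream
// element produces. Assumes the pipeline graph is acyclic.
function checkReplacement(replacement: PipelineElement): Hint[] {
  const hints: Hint[] = [];
  const visit = (element: PipelineElement, offered: EventSchema): void => {
    const missing = element.inputSchema.properties.filter(required =>
      !offered.properties.some(p =>
        p.runtimeName === required.runtimeName &&
        p.runtimeType === required.runtimeType));
    if (missing.length > 0) {
      hints.push({
        elementId: element.id,
        message: element.name + ': missing or mismatched field(s) ' +
          missing.map(p => p.runtimeName).join(', '),
      });
    }
    element.successors.forEach(s => visit(s, element.outputSchema));
  };
  replacement.successors.forEach(s => visit(s, replacement.outputSchema));
  return hints;
}

An empty result would mean the replacement is fully compatible and no hints need to be shown at all.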
For exchanging adapters: this could be handled similarly to the processor replacement case. Assuming the development/test data set source is the same as the live data source (on a schema level etc.), there should be no element-level hints asking for any reconfiguration of downstream elements. The question then is whether we would still need some kind of pipeline duplication mechanism.

Looking at both scenarios, this has further implications that generally resemble pipeline evolution, in the sense of new pipeline versions emerging along the application life cycle. So I ask myself whether this is something that needs to be clarified before we talk about how pipeline modeling should work. Versioning could be a valuable extension to the general pipeline overview, structuring pipeline edits in an immutable, documented fashion. It would also allow rolling back to previous versions, including user-defined tags to organize pipelines, e.g., "dev", "prod", or any other arbitrary tag.

So my questions on this are:

* Do we still want pipeline edits to update old pipeline graphs or to create a new pipeline graph (the current approach)? or
* Do we want pipeline edits to be stored and grouped as an "evolution" of the previous pipeline version, like a pipeline version control (see the sketch below)?
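If we went for the second option, the storage side could be fairly simple, e.g. an append-only version log per pipeline where every edit creates a new immutable revision instead of overwriting the stored graph. Again just a sketch with made-up names (PipelineVersion, PipelineHistory), not an implementation proposal:

// Made-up names; a minimal append-only version log per pipeline.
interface PipelineVersion {
  pipelineId: string;
  revision: number;               // increases monotonically per pipeline
  parentRevision: number | null;  // previous revision, null for the first
  graph: string;                  // serialized pipeline graph, e.g. JSON
  tags: string[];                 // user-defined, e.g. "dev", "prod"
  createdAt: Date;
  comment: string;                // short description of the edit
}

class PipelineHistory {
  private versions: PipelineVersion[] = [];

  // Every edit becomes a new immutable revision.
  commit(pipelineId: string, graph: string,
         comment = '', tags: string[] = []): PipelineVersion {
    const latest = this.latest(pipelineId);
    const version: PipelineVersion = {
      pipelineId,
      revision: latest ? latest.revision + 1 : 1,
      parentRevision: latest ? latest.revision : null,
      graph,
      tags,
      createdAt: new Date(),
      comment,
    };
    this.versions.push(version);
    return version;
  }

  latest(pipelineId: string): PipelineVersion | undefined {
    return this.versions
      .filter(v => v.pipelineId === pipelineId)
      .sort((a, b) => b.revision - a.revision)[0];
  }

  // Rollback is itself a commit, so the log stays append-only.
  rollback(pipelineId: string, revision: number): PipelineVersion | undefined {
    const target = this.versions.find(
      v => v.pipelineId === pipelineId && v.revision === revision);
    return target
      ? this.commit(pipelineId, target.graph,
                    'rollback to revision ' + revision, target.tags)
      : undefined;
  }
}

Modeling rollback as a new commit keeps the history itself immutable, which would also give us the documentation trail for free.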
Patrick

> On 26.11.2021 at 09:22, Marco Heyden <[email protected]> wrote:
>
> Hi Dominik, hi Philipp,
>
> I totally agree with you and have had this situation several times, where I
> had to rebuild the whole pipeline just to add another processor in between
> or to change the data source.
> So in my opinion, one should be able to change pipeline elements, even if
> they have predecessors and successors. I think changes like that can really
> enhance usability and adoption.
>
> In the case of building the same pipeline for multiple data sources, I find
> it most practical to simply create a copy of an existing pipeline and adapt
> it, e.g., by replacing the data source. IMO this is more lightweight than
> implementing something like pipeline templates.
>
> Best
> Marco
>
> -----Original Message-----
> From: Philipp Zehnder <[email protected]>
> Sent: Friday, 26 November, 2021 08:21
> To: [email protected]
> Subject: Re: How should pipeline modeling work?
>
> Hi Dominik,
>
> yes, indeed this can be quite frustrating.
> I often need to build the same pipeline for multiple similar data sources.
> Further, I sometimes build a pipeline on a pre-collected data set with the
> data set adapter. Once I have found the best parameters, I apply this
> pipeline to the streaming data source.
> This process is time-consuming and error-prone.
>
> So I guess it would be good if we could exchange the data source, or change
> the beginning of the pipeline.
> The main question is how flexible this should be. The reason for the current
> approach is that the configuration of the downstream algorithms depends on
> the configuration of the processing element, right?
>
> So my questions would be:
> * Do we want a completely flexible solution where a user can change anything?
> * Should it be possible to replace data streams?
>
> Philipp
>
> On 2021/11/25 17:54:32 Dominik Riemer wrote:
>> Hi,
>>
>> currently, when building pipelines, we use a rather strict validation
>> approach where pipeline elements are added one after the other, and a new
>> connection can only be added if the ancestor element is fully configured.
>>
>> The drawback is that it is currently impossible to exchange intermediate
>> pipeline elements (e.g., replace a stream while keeping the rest of the
>> pipeline).
>>
>> My feeling is that this can be frustrating and time-consuming for users in
>> case only minor changes are applied to the pipeline structure.
>>
>> What's your opinion on that? How should a perfect pipeline modeling
>> process work from your point of view?
>>
>> Dominik
