Thank you for pointing this out. The awareness problem applies to me here; I have
learned my lesson about discussing such changes on the dev list.

About SolrIO: I'll start a thread on user@ to discuss which versions should
be supported, and make the relevant changes once we reach a conclusion.

On 2020/10/22 14:24:45, Ismaël Mejía <[email protected]> wrote: 
> I have seen ongoing work on upgrading dependencies. This is a great task, needed
> for the health of the project and its IO connectors; however, I am a bit worried
> about the impact of these upgrades on existing users. We should be aware that we
> support old versions of the clients for valid reasons. If we update a version of
> a client, we should ensure that it still interacts correctly with existing users
> and runtime systems. Basically we need two conditions:
> 
> 1. We cannot update dependencies without considering the current use of them.
> 2. We must avoid upgrading to a non-stable or non-LTS dependency version
> 
> For (1), in a recent thread Piotr brought up some issues about updating Hadoop
> dependencies to version 3. This surprised me because the whole Big Data
> ecosystem is just catching up with Hadoop 3 (Flink does not even release
> artifacts for this yet, and Spark just started on version 3 some months ago),
> which means that most of our users still need us to guarantee compatibility
> with Hadoop 2.x dependencies.
> 
> The Hadoop dependencies are mostly 'provided', so a way to achieve this is by
> creating new test configurations that guarantee backwards (or forwards)
> compatibility by providing the respective versions. This is similar to what we
> currently do in KafkaIO: we use version 1.0.0 by default but test
> compatibility with 2.1.0 by providing the right dependencies too.
> 
> The same thread also discusses upgrading to the latest version, 3.3.x, but per
> (2) we should not consider upgrades to non-stable versions; the stable version
> of Hadoop is currently 3.2.1. https://hadoop.apache.org/docs/stable/
> 
> I also saw a recent upgrade of SolrIO to version 8, which may affect some users
> of previous versions, with no discussion about it on the mailing lists and no
> backwards compatibility guarantees.
> https://github.com/apache/beam/pull/13027
> 
> In the Solr case I think this update probably makes more sense, since Solr 5.x
> is deprecated and fewer people would probably be impacted, but it would still
> have been good to discuss this on user@.
> 
> I don't know how we can find a good equilibrium between maintainers and users
> in deciding on those upgrades without adding much overhead. Should we maybe
> have a VOTE for the most sensitive dependencies? Or just leave this to the
> judgment of the maintainers? I am afraid we may end up with incompatible
> changes due to the lack of awareness, or for not much in return, but at the
> same time I wonder if it makes sense to add the extra work of discussion for
> minor dependencies where this matters less.
> 
> Should we maybe document the sensitive dependency upgrades (the recent
> thread on the Avro upgrade comes to mind too)? Or should we have the same
> criteria for all? Other ideas?
> 
