Thank you for pointing this out. The lack-of-awareness problem applies to me here - I have learned a good lesson about discussing such changes on the dev list.
About SolrIO - I'll start a thread on user@ to discuss which versions should be supported, and make the relevant changes once we reach a conclusion.

On 2020/10/22 14:24:45, Ismaël Mejía wrote:
> I have seen ongoing work on upgrading dependencies. This is a great task,
> needed for the health of the project and its IO connectors; however, I am a
> bit worried about the impact of these upgrades on existing users. We should
> be aware that we support old versions of the clients for valid reasons. If
> we update a version of a client we should ensure that it still interacts
> correctly with existing users and runtime systems. Basically we need two
> conditions:
>
> 1. We cannot update dependencies without considering the current use of them.
> 2. We must avoid upgrading to a non-stable or non-LTS dependency version.
>
> For (1), in a recent thread Piotr brought up some issues about updating
> Hadoop dependencies to version 3. This surprised me because the whole Big
> Data ecosystem is just catching up with Hadoop 3 (Flink does not even release
> artifacts for this yet, and Spark just started on version 3 some months ago),
> which means that most of our users still need us to guarantee compatibility
> with Hadoop 2.x dependencies.
>
> The Hadoop dependencies are mostly 'provided', so a way to achieve this is by
> creating new test configurations that guarantee backwards (or forwards)
> compatibility by providing the respective versions. This is similar to what
> we currently do in KafkaIO, using version 1.0.0 by default but testing
> compatibility with 2.1.0 by providing the right dependencies too.
>
> The same thread also discusses upgrading to the latest version, 3.3.x, but
> per (2) we should not consider upgrades to non-stable versions; the stable
> version of Hadoop is currently 3.2.1.
> https://hadoop.apache.org/docs/stable/
>
> I also saw a recent upgrade of SolrIO to version 8, which may affect some
> users of previous versions, with no discussion about it on the mailing lists
> and no backwards compatibility guarantees.
> https://github.com/apache/beam/pull/13027
>
> In the Solr case I think this update probably makes more sense, since Solr
> 5.x is deprecated and fewer people would probably be impacted, but still it
> would have been good to discuss this on user@.
>
> I don't know how we can find a good equilibrium between maintainers and
> users deciding on those upgrades without adding much overhead. Should we
> maybe have a VOTE for the most sensitive dependencies? Or just assume this is
> a criterion for the maintainers? I am afraid we may end up with incompatible
> changes due to the lack of awareness, or for not much in return, but at the
> same time I wonder if it makes sense to add the extra work of discussion for
> minor dependencies where this matters less.
>
> Should we maybe document the sensitive dependency upgrades (the recent
> thread on the Avro upgrade comes to my mind too)? Or should we have the same
> criteria for all? Other ideas?
