+1 for the idea of decoupling backend-specific code from the common Java/C++ code! It is well worth doing: it will greatly improve code readability and will help a lot when introducing other backends or supporting other engines, e.g. Flink. It will indeed take a lot of effort to get the common abstractions and API design right. Thanks for the proposal, I'm interested in this topic!
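
To make the configuration part of the proposal a bit more concrete (moving backend-specific settings out of `GlutenConfig.scala` into something like `VeloxConfig` / `CHConfig`), here is a minimal sketch in Java. It is only an illustration under my own assumptions: the interface, the class names and the config keys are all hypothetical and not taken from the Gluten code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these types or keys come from the Gluten code base.

/** Common module: the only thing shared code knows about a backend's settings. */
interface BackendConfig {
    String backendName();

    /** Default values for the keys this backend owns. */
    Map<String, String> defaults();
}

/** Would live in the Velox backend module; the key below is invented. */
final class VeloxConfigSketch implements BackendConfig {
    @Override
    public String backendName() {
        return "velox";
    }

    @Override
    public Map<String, String> defaults() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("example.velox.someOption", "true");
        return m;
    }
}

/** Would live in the ClickHouse backend module; the key below is invented. */
final class CHConfigSketch implements BackendConfig {
    @Override
    public String backendName() {
        return "ch";
    }

    @Override
    public Map<String, String> defaults() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("example.ch.someOption", "42");
        return m;
    }
}

/** Common module: resolves a key without referring to any concrete backend. */
final class CommonConfigSketch {
    private CommonConfigSketch() {}

    /** A user-supplied value wins; otherwise fall back to the backend's default. */
    static String resolve(BackendConfig backend, Map<String, String> userConf, String key) {
        String userValue = userConf.get(key);
        return userValue != null ? userValue : backend.defaults().get(key);
    }
}

class ConfigSplitDemo {
    public static void main(String[] args) {
        Map<String, String> userConf = new LinkedHashMap<>();
        userConf.put("example.ch.someOption", "7"); // user override for one backend key

        BackendConfig[] backends = {new VeloxConfigSketch(), new CHConfigSketch()};
        for (BackendConfig b : backends) {
            for (String key : b.defaults().keySet()) {
                System.out.println(b.backendName() + ": " + key + " = "
                        + CommonConfigSketch.resolve(b, userConf, key));
            }
        }
    }
}
```

The point of the sketch is the dependency direction: the common code depends only on the small interface, and each backend module owns its keys and defaults, so adding another backend would not require touching the common config class.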

On Fri, Nov 8, 2024 at 4:10 PM Hongze Zhang <[email protected]> wrote:

> For people who are interested in the topic, there is a mirrored GitHub
> discussion here[1] and we already got some good ideas. Thanks for
> everyone's help.
>
> Best,
> Hongze
>
> [1] https://github.com/apache/incubator-gluten/discussions/7735
>
> On Wed, Oct 30, 2024 at 2:01 PM Hongze Zhang <[email protected]> wrote:
> >
> > Hi folks,
> >
> > In our code base, there are still some contents in the common module
> > that actually belong to one certain backend (CH or VL). Do you think we
> > can do something to get rid of them?
> >
> > The contents I found:
> >
> > 1. Backend-specific configurations in GlutenConfig.scala
> > 2. Backend-specific code in the UT module, e.g., the calls to
> > BackendTestUtils.is<name>BackendLoaded
> >
> > A similar issue also applies to our source directory tree, for example:
> >
> > 1. `gluten-ut` module has spark3x/src/test/backends-clickhouse and
> > spark3x/src/test/backends-velox at the same time
> > 2. `cpp` (velox cpp code dir) / `cpp-ch` are in the root directory
> > 3. Ecosystem supports, e.g., `gluten-uniffle` also has a `velox` subfolder
> > 4. Documentation
> >
> > For a long time my opinion has been that we should finally get most of
> > these contents moved to their own modules. For example, UT customizations
> > can be placed in `backend-clickhouse` and `backend-velox`.
> > Configurations can be placed in `VeloxConfig` / `CHConfig` or
> > something, like I mentioned in another issue[1], etc.
> >
> > Having said that, I think there should be some contents that are
> > unnecessary to move around so they could remain in common modules, for
> > example, the backend GHA CI scripts and backend documentation. Others
> > look reasonable to move, but considerable refactoring effort
> > will be needed.
> >
> > In the future, I think I could continue taking the majority of this work,
> > but help will be needed when it comes to the CH backend or to the
> > ecosystem code (rss, data lake).
> >
> > Any thoughts will be appreciated.
> >
> > Thanks,
> > Hongze
> >
> > [1] https://github.com/apache/incubator-gluten/issues/6970

--
Best Regards,
Terry Wang
