Hi Chris,
  Thanks for this proposal! We have had quite a few issues with our
monolithic repository, and I think it has hindered the development and
maintenance of new connectors.
  JB makes some good points that are worth considering.

  My 2c:
   My vote is probably to separate the connectors into their own repo, and
in fact to support multiple repos that can each contain separate
connectors.
   This will also help us clarify the "public API" of the Gobblin framework
versus the internal details that many connectors probably depend on today.

 I would rather follow the Kafka Connect model: the core framework has
APIs and is versioned independently from connector implementations, which
can live in other repositories. Implementations should feature in the
"Connector Matrix" as part of the documentation for discoverability.

There can be an official catalog of supported connectors, and maybe that
can be our first "repo" that Abhishek is proposing. But I would make sure
we are not creating a new monorepo pattern with it.

What do others think?
Shirshanka

On Mon, Mar 22, 2021 at 10:09 PM, Jean-Baptiste Onofre <[email protected]>
wrote:

> Hi Chris,
>
> I agree that connectors are very important. Other Apache projects became
> popular mostly thanks to their connector sets (I’m thinking of Apache Beam,
> Apache Camel, or Apache Karaf Decanter for instance). Connectors allow
> more users to "integrate" Gobblin into their ecosystem, so they would grow
> our user community. They will also grow our dev community, as it’s
> probably easier to contribute to a connector than to the Gobblin core.
>
> About repo vs. module, there are two questions IMHO:
> 1. How do we keep the API and code in sync between Gobblin core and the
> connectors?
> 2. Do we plan to have a different release cycle for core and connectors
> (even if it’s always possible to release a module on its own)?
>
> IMHO, if we plan to do a Gobblin release including core + connectors, then
> a module is easier.
>
> Regards
> JB
>
> On Mar 22, 2021, at 23:44, Chris Li <[email protected]> wrote:
>
> Proposal:
>
> DIL (LinkedIn internal project name) is a generic multi-stage Gobblin
> connector library. The code can be accessed here:
> https://github.com/linkedin/gobblin-connectors. Its core features and
> high-level descriptions are shared here:
> https://engineering.linkedin.com/blog/2021/data-integration-library.
>
> Following initial discussion with members of the Gobblin community, we
> are proposing a separate sub-repo for this library.
>
> Why:
> Some thoughts/justifications of a sub-repo vs. a module in the main
> Gobblin repo.
>
> 1. Gobblin connectors are an important part of the Gobblin ecosystem, but
> the development of connectors is relatively independent of Gobblin core.
> 2. Connectors are where open source communities can contribute the most,
> and they will grow much faster than Gobblin core.
> 3. The new connector library is a comprehensive package of unique design
> patterns. This is where the data integration diversity challenge will be
> addressed. The importance of this code base grows by the day as more
> integration scenarios become supported.
> 4. The new connector library evolves and replaces many prior Gobblin
> connectors under the “gobblin-modules” module. A separate repo will help
> avoid confusion.
> 5. Separating core and ecosystem modules can help improve isolation and
> reduce the number of defects.
>
> Regards,
> Chris
>
>
