keon94 commented on issue #3123: URL: https://github.com/apache/incubator-devlake/issues/3123#issuecomment-1289191947
Summary of impressions:

Pros:
1. API calls are completely abstracted away, with no code needed from our side. We just have to study the streams defined in the tap's catalog.json to see what's available to us, and that's straightforward.
2. Responses are described by JSON schemas (in catalog.json), so with code generation we get compile-time type safety and eliminate the boilerplate of manually writing models for them.
3. Built-in support for state, so Streams can be resumed (where supported). This is fully abstracted away from the developer.

Cons:
1. Gives us less control over the API calls. Things like managing parallelism and rate-limiting are not within our control. In general, we don't have access to the query params or headers that get sent. The "config.json" is the most we can control.
2. The quality of the Tap implementations can be questionable. See my comment above. Some are not well-maintained, or not necessarily in sync with the most current API contracts.

My conclusion: I think singer-plugins can be useful as long as the plugin doesn't need the Collector layer. Plugins such as Jira do need the raw layer for things like custom fields and for modifying transformation rules without re-collection. My current adaptation of the Singer framework bypasses the Collector stage and goes straight to Extract. If Collectors are absolutely necessary for all plugins, I don't think we're gaining anything with Singer. In fact, we lose, since we no longer have granular access to the API calls. If there are more trivial plugins that can get away without Collectors, and they come with well-maintained Singer taps, I think the framework is useful for them.

@klesh @Startrekzky @hezyin please share your thoughts.
