StephanEwen commented on pull request #13512:
URL: https://github.com/apache/flink/pull/13512#issuecomment-702303125


   No worries, let's figure this out together.
   
   ### About the flink-core package structure.
   
   Agree, `flink-core`'s package structure is wild.
   
   Most of it is because of the pre-streaming origin, and there was not too 
much thought there. `api.common` was shared between Batch Java and Batch Scala 
APIs. `api.java` was originally in a different project (Batch Java API) and 
moved to core when streaming came into the mix to be shared with the Streaming 
Java API.
   
   `core.io` was IIRC IO-related stuff that was not part of the API, but 
indirectly relevant in core API classes, like the utils around the managed 
memory abstraction. `api.common.io` was for API-related I/O, like sources, 
sinks.
   
   A lot of stuff was also added over time in the wrong places, admittedly, 
because few people in the early days paid attention to package structure. The 
packages stuck because we didn't want to break user applications. For Flink 
2.0, we should try and simplify this.
   
   tl;dr - it is the way it is for historic reasons and because the "not 
breaking application" consideration ended up trumping the "clean up packages" 
consideration. Seems the structure was not getting too much in the way of users 
after all.
   
   ### "flink-connector-base" versus "flink-core"
   
   What I would like to keep is that this source implementation stays in 
`flink-core` and does not move to another module. We have a lot of dependencies 
already and it really doesn't hurt in my opinion to have that implementation 
there in the core (given that it is lightweight and has no dependencies).
   
   The Split-Reader API could be moved to flink-core, theoretically. I didn't 
argue for that in the original PR, because it is not used for simple sources 
(like here), but typically for sources with external systems that need separate 
I/O threads. These sources would be in separate modules anyways. So having the 
Split Reader API in a separate module would not make any difference to users, 
so I just went with your suggestion there (if it makes no difference, I 
typically go with what the code author suggests). IT has the slight advantage 
even. of being more flexible with test dependencies, so Integration Tests can 
be in the module directly.
   
   ### Packages in this PR
   
   About the `lib` and `util` package names, I am not opinionated on those. In 
fact, I don't particularly like `lib` either.
   Anything that is reasonably "discoverable" for users would work here from my 
side. Let's try to up with different names or locations, I am definitely open 
to suggestions.
   Maybe we'll move this in a follow-up effort, to help unblock some features 
where I wanted to use this source in the tests.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to