Right now, we have two separate Maven modules for batch and streaming
connectors (flink-batch-connectors and flink-streaming-connectors) that
contain modules for the individual external systems and storage formats
such as HBase, Cassandra, Avro, Elasticsearch, etc.
Some of these systems, for instance HBase, Cassandra, and Elasticsearch, can
be used in both streaming and batch jobs. However, due to the separate
main modules for streaming and batch connectors, we currently need to
decide where to put a connector. For example, the flink-connector-cassandra
module is located in flink-streaming-connectors but includes a
CassandraInputFormat and CassandraOutputFormat (i.e., a batch source and
sink).
In my opinion, it would be better to just merge flink-batch-connectors and
flink-streaming-connectors into a joint flink-connectors module.
This would only be an internal restructuring of the code and would not be
visible to users (unless we change the module names of the individual
connectors, which is not necessary, IMO).
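To make the idea concrete, here is a rough sketch of what the aggregator pom of
the joint module could look like. The layout is only my assumption of how the
merge might be done; only the <modules> section is shown, and apart from
flink-connector-cassandra the list of child modules is left open on purpose.

```xml
<!-- flink-connectors/pom.xml (hypothetical sketch, not the actual file) -->
<artifactId>flink-connectors</artifactId>
<packaging>pom</packaging>

<modules>
  <!-- currently located in flink-streaming-connectors; name unchanged -->
  <module>flink-connector-cassandra</module>
  <!-- ... the remaining batch and streaming connector modules,
       each keeping its current module name ... -->
</modules>
```

Since the child modules would keep their artifact names, user dependencies would
resolve exactly as before; only the directory and parent structure would change.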
What do others think?