Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21306

@stczwd, I agree with @mccheah. Tables are basically named data sets. Whether they support batch, micro-batch streaming, or continuous streaming is determined by checking whether they implement SupportsBatchScan or similar interfaces. Matt's referenced docs are the right place to go for more context.

The purpose here is to make catalogs and reads orthogonal. A catalog can return both batch-compatible and stream-compatible source "tables". A "table" may be a Kafka topic or a file-based data source, and note that both of those can support batch and streaming execution. A Kafka topic could be a CDC stream that represents a table, and a file-based source could be streamed by periodically checking for new committed files.

This PR is based on an [SPIP](https://docs.google.com/document/d/1zLFiA1VuaWeVxeTDXNg8bL6GP3BVoOZBkewFtEnjEoo/edit#heading=h.7vhjx9226jbt). That document has some background on why I chose this set of table attributes (schema, partitioning, properties); the short summary is that they are the core attributes used in comparable SQL variants and already used in Spark.
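To make the capability-checking idea concrete, here is a minimal Scala sketch of how a catalog could stay orthogonal to read modes. Only SupportsBatchScan is named above; the Table/TableCatalog shapes and the micro-batch/continuous interface names below are assumptions for illustration, not the exact API proposed in this PR or the SPIP.

```scala
import org.apache.spark.sql.types.StructType

// Illustrative sketch only: these traits approximate the idea discussed above,
// not the actual interfaces in the PR.

// A table is just a named data set described by schema, partitioning, and properties.
trait Table {
  def schema: StructType
  def partitioning: Seq[String]              // simplified; the SPIP uses partition expressions
  def properties: Map[String, String]
}

// Capability mix-ins: a table opts into an execution mode by implementing the
// corresponding interface, so catalogs and reads stay orthogonal.
trait SupportsBatchScan extends Table
trait SupportsMicroBatchScan extends Table   // assumed name, for illustration
trait SupportsContinuousScan extends Table   // assumed name, for illustration

// A catalog only resolves names to tables; it says nothing about how they are read.
trait TableCatalog {
  def loadTable(name: String): Table
}

// A planner would then pick an execution mode by checking which interfaces
// the returned table implements:
def supportsStreaming(table: Table): Boolean =
  table.isInstanceOf[SupportsMicroBatchScan] || table.isInstanceOf[SupportsContinuousScan]
```

Under this shape, the same catalog can hand back a Kafka-topic table and a file-based table, and each one advertises batch or streaming support independently of where it came from.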