[ https://issues.apache.org/jira/browse/FLINK-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334929#comment-16334929 ]

ASF GitHub Bot commented on FLINK-8240:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5240#discussion_r162947812
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/BatchTableEnvironment.scala ---
    @@ -107,6 +111,16 @@ abstract class BatchTableEnvironment(
         }
       }
     
    +  /**
    +    * Creates a table from a descriptor that describes the resulting table
    +    * schema, the source connector, source encoding, and other properties.
    +    *
    +    * @param schema schema descriptor describing the table to create
    +    */
    +  def createTable(schema: Schema): BatchTableSourceDescriptor = {
    --- End diff --
    
    I'm not sure about the approach of returning a `TableSourceDescriptor`. I 
think it would be better if table creation and registration were completed 
within this method, i.e., if the table were completely defined by the method's 
argument.
    
    For example
    
    ```
    tEnv.registerTableSource(
      "MyTable",
      TableSource.create(tEnv)
        .withSchema(
          Schema()
            .field(...)
            .field(...))
        .withConnector()
          ...
        .toTableSource()
    )
    ```
    
    In this design, we would reuse the existing `registerTableSource` method, 
and `TableSource.create` would be a static method that returns a 
`TableSourceDescriptor`. I'm not sure this is the best approach, but I like 
that the table is completely defined within the method call.
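The fluent design sketched above could look roughly as follows. This is a hypothetical Java-flavored sketch for illustration only: apart from `TableSource.create`, `withSchema`, `withConnector`, and `toTableSource`, which appear in the comment, every class, field, and signature here is an assumption, not actual Flink API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed fluent builder; not Flink code.
public class TableSourceSketch {

    /** Collects the fields of the table to create. */
    static class Schema {
        final List<String> fields = new ArrayList<>();
        Schema field(String name, String type) {
            fields.add(name + ": " + type);
            return this;
        }
    }

    /** Stand-in for the fully defined table source. */
    static class SketchTableSource {
        final Schema schema;
        final String connector;
        SketchTableSource(Schema schema, String connector) {
            this.schema = schema;
            this.connector = connector;
        }
    }

    /** Descriptor that accumulates all parts before the source is built. */
    static class TableSourceDescriptor {
        private Schema schema = new Schema();
        private String connector = "";

        TableSourceDescriptor withSchema(Schema s) { this.schema = s; return this; }
        TableSourceDescriptor withConnector(String c) { this.connector = c; return this; }

        // The table is completely defined once toTableSource() is called,
        // so the result can be handed to registerTableSource as one value.
        SketchTableSource toTableSource() {
            return new SketchTableSource(schema, connector);
        }
    }

    static class TableSource {
        /** Static entry point, as suggested in the review comment. */
        static TableSourceDescriptor create() { return new TableSourceDescriptor(); }
    }

    public static void main(String[] args) {
        SketchTableSource src = TableSource.create()
            .withSchema(new Schema().field("id", "LONG").field("name", "STRING"))
            .withConnector("kafka")
            .toTableSource();
        System.out.println(src.connector + " / " + src.schema.fields);
    }
}
```

The point of the sketch is that the descriptor is only an intermediate builder value; the environment never sees a half-configured table, because registration consumes the finished `toTableSource()` result.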


> Create unified interfaces to configure and instantiate TableSources
> -------------------------------------------------------------------
>
>                 Key: FLINK-8240
>                 URL: https://issues.apache.org/jira/browse/FLINK-8240
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>
> At the moment, every table source has its own way of configuration and 
> instantiation. Some table sources are tailored to a specific encoding (e.g., 
> {{KafkaAvroTableSource}}, {{KafkaJsonTableSource}}) or support only one 
> encoding for reading (e.g., {{CsvTableSource}}). Each of them might implement 
> a builder or support table source converters for external catalogs.
> The table sources should have a unified interface for discovery, defining 
> common properties, and instantiation. The {{TableSourceConverters}} provide 
> similar functionality but use an external catalog. We might generalize this 
> interface.
> In general a table source declaration depends on the following parts:
> {code}
> - Source
>   - Type (e.g. Kafka, Custom)
>   - Properties (e.g. topic, connection info)
> - Encoding
>   - Type (e.g. Avro, JSON, CSV)
>   - Schema (e.g. Avro class, JSON field names/types)
> - Rowtime descriptor/Proctime
>   - Watermark strategy and Watermark properties
>   - Time attribute info
> - Bucketization
> {code}
> This issue needs a design document before implementation. Any discussion is 
> very welcome.
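One way to read the declaration parts listed above is as a flat key/value property format, with one namespace per part. The sketch below is a hypothetical illustration of that idea; the key names, the method, and the example values are all assumptions, not the design this issue will eventually settle on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the declaration parts from the issue description
// rendered as a flat property map. Key names are invented for illustration.
public class DeclarationSketch {

    public static Map<String, String> describeKafkaAvroSource() {
        Map<String, String> props = new LinkedHashMap<>();
        // Source: type and connection properties
        props.put("source.type", "kafka");
        props.put("source.topic", "sensor-readings");
        // Encoding: type and schema
        props.put("encoding.type", "avro");
        props.put("encoding.schema.class", "com.example.SensorReading");
        // Rowtime/Proctime: watermark strategy and time attribute info
        props.put("rowtime.watermark.strategy", "bounded-out-of-orderness");
        props.put("rowtime.attribute", "eventTime");
        return props;
    }
}
```

A flat format like this would make declarations easy to validate, diff, and store in an external catalog, since every part of the source, encoding, and time-attribute configuration reduces to string properties.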



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
