Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/20387
> This is a new API...
Are you saying you wanna add a new method in `DataFrameReader` that is
different from `load`? In Scala, a parameter name is part of the method
signature, so for `def load(path: String)` we can't change its semantics: the
parameter is a path. It's fine if a data source implementation teaches its
users that the path will be interpreted as database/tables, but this should
not be a contract in Spark.
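To illustrate why the parameter name matters (a minimal sketch with no Spark dependency; `NamedArgDemo` and its `load` stub are hypothetical stand-ins, not real Spark code): Scala callers can pass arguments by name, so renaming or reinterpreting `path` is a source-breaking API change.

```scala
// Hypothetical stand-in showing that a Scala parameter name is part of
// the public API: callers may pass it by name, so renaming `path` or
// changing what it means breaks existing call sites.
object NamedArgDemo {
  // Stand-in for DataFrameReader.load(path: String)
  def load(path: String): String = s"loading from path: $path"

  def main(args: Array[String]): Unit = {
    // Both call styles compile today; the second one stops compiling
    // if `path` is ever renamed.
    println(load("/data/events"))
    println(load(path = "/data/events"))
  }
}
```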
I do agree that Spark should set a standard for specifying database and
table, as it's very common. We can even argue that path is not a general
concept for data sources, but we still provide special APIs for path.
My proposal: how about we add a new method `table` in `DataFrameReader`?
The usage would look like:
`spark.read.format("iceberg").table("db.table").load()`. What do you think?
We should not specify `database` separately, because with catalog federation
a table name may have 3 parts: `catalog.db.table`. Let's keep it general and
let the data source interpret it.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]