Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14638
  
    Thank you so much, @jamartinh , @srowen , @HyukjinKwon , and @gatorsmile .
    
    There are two separate existing problems here.
    
    First, **a)** Spark returns an incorrect result for an existing Hive table 
that already has the `skip.header.line.count` table property. This is the most 
common use case, and the one this issue aims to solve.
    
    Second, more ridiculously, **b)** Spark can create a table with the 
`skip.header.line.count` table property, but only Hive returns the correct 
result from that table.
    
    **SPARK (Current master branch)**
    ```scala
    scala> sql("CREATE TABLE t2 (id INT, value VARCHAR(10)) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' TBLPROPERTIES('skip.header.line.count'='1')")
    
    scala> sql("LOAD DATA LOCAL INPATH '/data/test.csv' OVERWRITE INTO TABLE t2")
    
    scala> sql("SELECT * FROM t2").show
    +----+-----+
    |  id|value|
    +----+-----+
    |null|   c2|
    |   1|    a|
    |   2|    b|
    +----+-----+
    ```
    **Hive**
    ```sql
    hive> select * from t2;
    OK
    1   a
    2   b
    ```
    
    @gatorsmile . I totally agree on the Apache Spark development direction.  
But, IMO, the choice between `TBLPROPERTIES` and `OPTION` is outside the scope 
of this PR, because this PR only updates `TableReader.scala` to support the 
existing table property, i.e. case **a)**. I used `TBLPROPERTIES` in the 
example simply because Spark already supports it. I can update the PR 
description to focus on **a)** instead of **b)**.
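    
    To make case **a)** concrete, here is a minimal sketch of the intended 
behavior, not the actual change in `TableReader.scala`. The object 
`SkipHeaderSketch`, its methods, and the `h1,h2` header line are hypothetical 
names and sample data used only for illustration. It assumes the table 
properties are available as a plain `Map` and that the lines of one file 
arrive as an iterator.
    
    ```scala
    // Hypothetical helper for illustration only; not the code in this PR.
    object SkipHeaderSketch {
      val HeaderCountKey = "skip.header.line.count"
    
      // Read the property; a missing, malformed, or negative value means "skip nothing".
      def headerCount(tblProps: Map[String, String]): Int =
        tblProps.get(HeaderCountKey)
          .flatMap(v => scala.util.Try(v.trim.toInt).toOption)
          .filter(_ > 0)
          .getOrElse(0)
    
      // Drop the leading header lines from the line iterator of a single file.
      def skipHeader(lines: Iterator[String], tblProps: Map[String, String]): Iterator[String] =
        lines.drop(headerCount(tblProps))
    
      def main(args: Array[String]): Unit = {
        val props = Map(HeaderCountKey -> "1")
        val file = Iterator("h1,h2", "1,a", "2,b") // made-up header plus two data rows
        skipHeader(file, props).foreach(println)
      }
    }
    ```
    
    A real reader would also need to apply the skip only at the start of each 
file, not at the start of every split. The sketch ignores that.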
    
    Someday, Apache Spark may remove (or block) the `TBLPROPERTIES` SQL syntax 
in favor of the `OPTION` syntax. That's okay; it would be an intentional 
regression. However, even in that case, we had better read Hive tables with 
`skip.header.line.count` correctly.

