[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

HyukjinKwon Tue, 03 Apr 2018 07:08:12 -0700

GitHub user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20959
  
    For usability, the workaround I suggested above is more flexible. For 
example, we can apply different operations (e.g., a filter) on the schema 
inference path. It takes only a few lines.
    
    Schema inference is discouraged in production. I believe, for example, 
that just taking 100 records and using their schema makes more sense.
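
    A minimal sketch of the "take a few records, infer once, then reuse the 
schema" idea, written in plain Python rather than against Spark's API. The 
helper names (`infer_type`, `infer_schema`, `read_with_schema`), the `keep` 
filter hook, and the three-type lattice (int, float, str) are all assumptions 
made for illustration; they are not taken from the pull request.

```python
import csv
import io
from itertools import islice


def infer_type(values):
    """Return the narrowest of int, float, str that fits every non-empty value."""
    for caster in (int, float):
        try:
            for v in values:
                if v:  # skip empty or missing cells
                    caster(v)
            return caster
        except ValueError:
            continue
    return str


def infer_schema(lines, sample_size=100, keep=lambda row: True):
    """Infer {column: type} from at most sample_size rows that pass `keep`.

    `keep` is a hypothetical hook standing in for "a different operation
    (e.g., a filter) on the schema inference path".
    """
    reader = csv.DictReader(lines)
    sample = list(islice(filter(keep, reader), sample_size))
    return {name: infer_type([row[name] for row in sample])
            for name in reader.fieldnames}


def read_with_schema(lines, schema):
    """Parse every row with the fixed schema; no inference happens here."""
    for row in csv.DictReader(lines):
        yield {name: (cast(row[name]) if row[name] else None)
               for name, cast in schema.items()}


data = "id,price,name\n1,2.5,a\n2,3,b\n3,4.75,c\n"

# Infer from the first two records only, then apply the schema to all rows.
schema = infer_schema(io.StringIO(data), sample_size=2)
rows = list(read_with_schema(io.StringIO(data), schema))
```

    The point of the split is that inference touches only the sample, while 
the full read uses a schema that is already fixed, which is how an explicitly 
supplied schema would normally be used in production.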
    
    I am not against this option, but I don't feel strongly about it for the 
reasons above.




