Qiegang Long created SPARK-55568:
------------------------------------

             Summary: Separate schema construction from field statistics collection
                 Key: SPARK-55568
                 URL: https://issues.apache.org/jira/browse/SPARK-55568
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.1.1, 4.1.0
            Reporter: Qiegang Long


Variant shredding schema inference is expensive and can take well over 100 ms
per file. This issue proposes an optimization that separates field statistics
collection from schema construction, in order to eliminate repeated schema
merges and intermediate allocations.
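
To illustrate the general idea only, here is a minimal Python sketch of a two-phase approach: one pass that updates per-field counters, followed by a single schema construction step. It is not Spark's implementation. The names `FieldStats`, `collect_stats`, `build_schema`, and the `min_ratio` threshold are all hypothetical, and the "schema" here is just a flat mapping from dotted field path to type name.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FieldStats:
    """Per-path counters gathered during the scan (hypothetical structure)."""
    count: int = 0
    types: set = field(default_factory=set)


def collect_stats(rows):
    """Phase 1: walk every row once and only update counters.

    No schema objects are created or merged here, so the per-row cost is
    limited to dictionary lookups and counter updates.
    """
    stats = defaultdict(FieldStats)

    def visit(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                visit(child, path + (key,))
        else:
            entry = stats[path]
            entry.count += 1
            entry.types.add(type(value).__name__)

    for row in rows:
        visit(row, ())
    return stats, len(rows)


def build_schema(stats, num_rows, min_ratio=0.5):
    """Phase 2: construct the schema once from the aggregated counters."""
    schema = {}
    if num_rows == 0:
        return schema
    for path, entry in sorted(stats.items()):
        # Keep only fields that appear often enough and have one consistent type
        # (an assumed selection rule, chosen for illustration).
        if entry.count / num_rows < min_ratio or len(entry.types) != 1:
            continue
        schema[".".join(path)] = next(iter(entry.types))
    return schema


rows = [
    {"a": 1, "b": {"c": "x"}},
    {"a": 2, "b": {"c": "y"}, "d": 1.5},
    {"a": 3, "b": {"c": 7}},
]
stats, num_rows = collect_stats(rows)
print(build_schema(stats, num_rows))
```

In this sketch the alternative being avoided would be inferring a small schema per row and merging it into a running result, which allocates an intermediate schema for every row. Here the only per-row work is counter maintenance, and the schema is built exactly once at the end.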



