[jira] [Created] (SPARK-18021) Refactor file name specification for data sources

Reynold Xin (JIRA) Wed, 19 Oct 2016 22:37:48 -0700

Reynold Xin created SPARK-18021:
-----------------------------------

             Summary: Refactor file name specification for data sources
                 Key: SPARK-18021
                 URL: https://issues.apache.org/jira/browse/SPARK-18021
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
            Reporter: Reynold Xin
            Assignee: Reynold Xin



Currently each data source's OutputWriter is responsible for specifying the 
entire file name of each output file. This is problematic because Spark SQL 
relies on the file name for certain behaviors, e.g. the bucket id. The current 
approach allows an individual data source to break the implementation of 
bucketing.

We also don't want to move the file name entirely out of the data sources, 
because different data sources need to specify different extensions.

A good compromise is for the OutputWriter to take in a prefix for the file 
name and append its own suffix.
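
To make the proposed split concrete, here is a rough sketch in Python rather 
than Spark's actual Scala API. Every name in it (file_prefix, Writer, 
bucket_id_of) and the exact name layout are assumptions made for illustration, 
not what Spark implements. The point is only that the framework owns the prefix 
(including the bucket id), the data source contributes just the extension, and 
the bucket id can therefore be recovered no matter which source wrote the file.

```python
import re
from typing import Optional


def file_prefix(split_id: int, job_id: str, bucket_id: Optional[int] = None) -> str:
    """Framework-owned part of the file name (hypothetical layout)."""
    bucket = "" if bucket_id is None else f"_{bucket_id:05d}"
    return f"part-{split_id:05d}-{job_id}{bucket}"


class Writer:
    """Hypothetical data source writer: it receives the prefix and may only
    append its own suffix (typically the file extension)."""

    def __init__(self, prefix: str, suffix: str) -> None:
        self.file_name = prefix + suffix


# The bucket id is the digit run after the last "_" in the prefix, followed
# either by the data source's suffix (starting with ".") or by end of name.
# Assumes job_id itself contains no "_<digits>." sequence.
_BUCKET = re.compile(r"_(\d+)(?:\.|$)")


def bucket_id_of(file_name: str) -> Optional[int]:
    """Recover the bucket id without knowing which data source wrote the file."""
    match = _BUCKET.search(file_name)
    return int(match.group(1)) if match else None


prefix = file_prefix(split_id=3, job_id="abc123", bucket_id=7)
json_file = Writer(prefix, ".json").file_name
csv_file = Writer(prefix, ".csv.gz").file_name
```

Under these assumptions both json_file and csv_file carry the same bucket id 
in the same position, so bucketing no longer depends on each data source 
formatting the name correctly.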




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
