[
https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048189#comment-15048189
]
Yanbo Liang commented on SPARK-12232:
-------------------------------------
I vote for not exposing read.table, because its semantics differ from both base R's
utils::read.table and the other read.* functions.
In "SQLContext.read.table(tableName: String)", a user loads a table as a
DataFrame by specifying its tableName, but the table's metadata must already
exist in a catalog such as "HiveMetastoreCatalog". This means a user cannot use
"read.table()" to load an external data source as a DataFrame if its metadata is
not stored in the Spark catalog; the user must know the file format and use the
corresponding function, such as "read.json".
The read.table interface is mainly used to access a table that has already been
registered with Spark SQL, so I think it is unnecessary for SparkR.
> Consider exporting read.table in R
> ----------------------------------
>
> Key: SPARK-12232
> URL: https://issues.apache.org/jira/browse/SPARK-12232
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 1.5.2
> Reporter: Felix Cheung
> Priority: Minor
>
> Since we have read.df, read.json, read.parquet (some in pending PRs), we have
> table() and we should consider having read.table() for consistency and
> R-likeness.
> However, this conflicts with utils::read.table which returns a R data.frame.
> It seems neither table() nor read.table() is desirable in this case.
> table: https://stat.ethz.ch/R-manual/R-devel/library/base/html/table.html
> read.table:
> https://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]