[
https://issues.apache.org/jira/browse/SPARK-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051716#comment-15051716
]
Dan Putler commented on SPARK-12148:
------------------------------------
Michael Lawrence's arguments are valid. The S4Vectors package seems to be
an important Bioconductor package, and likely has a lot of users in the
bioinformatics community, which is also likely to have a high share of
SparkR users, so the name collision issues are real. The one
thing that needs to be thought through is how to mitigate the effect on
existing SparkR code that users have written (I'm less concerned about the
inconsistencies in the naming convention across Scala, Python, and R). Given
the fairly short period of time SparkR has supported DataFrames, the amount of
existing user code is likely not enormous. However, I think it does make sense
to have a transition period of one Spark release where a call to (SparkR)
DataFrame results in a warning that the function is being deprecated, and that
SparkDataFrame (or whatever else we choose to rename it) should be used instead.
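
As a rough illustration of that transition period, here is a minimal sketch in
plain R. It is not SparkR's actual implementation: the class definition, the
"sdf" slot, and both constructor functions are hypothetical stand-ins, and
only .Deprecated() is an existing base R function.

```r
library(methods)

# Hypothetical stand-in for the renamed class; the real SparkR class
# holds a reference to a JVM object, which is not modelled here.
setClass("SparkDataFrame", representation(sdf = "ANY"))

# New constructor under the proposed name (name is an assumption).
SparkDataFrame <- function(sdf) {
  new("SparkDataFrame", sdf = sdf)
}

# Old name kept for one release: warn, then delegate to the new name.
DataFrame <- function(sdf) {
  .Deprecated("SparkDataFrame")
  SparkDataFrame(sdf)
}

# Existing user code keeps working but now emits a deprecation warning.
df <- DataFrame("placeholder")
```

With a shim like this, old scripts continue to run during the transition
release, and the wrapper can be deleted in the release after that.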
> SparkR: rename DataFrame to SparkDataFrame
> ------------------------------------------
>
> Key: SPARK-12148
> URL: https://issues.apache.org/jira/browse/SPARK-12148
> Project: Spark
> Issue Type: Wish
> Components: R, SparkR
> Reporter: Michael Lawrence
> Priority: Minor
>
> The SparkR package represents a Spark DataFrame with the class "DataFrame".
> That conflicts with the more general DataFrame class defined in the S4Vectors
> package. Would it not be more appropriate to use the name "SparkDataFrame"
> instead?