In some applications, I make rather heavy use of Java enums, which are
needed by related Java APIs that the application uses. Unfortunately,
they are also used as keys. Enum.hashCode() is identity-based and can
differ from one JVM to the next, so any function over the native hash
codes of such keys is unstable and unpredictable. We now use
Enum.name() as the key instead. Oh well. It works, and it seems to
work well.
Kevin
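A minimal sketch of the Enum.name() workaround described above, written in Scala against a JDK enum. java.util.concurrent.TimeUnit is only a stand-in for the application's enums, and the names StableEnumKey and stablePartition are hypothetical. It assumes the "function over keys" is a hash-based partitioner; the point is that String.hashCode is specified by the Java API, while Enum.hashCode is not stable across JVMs.

```scala
import java.util.concurrent.TimeUnit

object StableEnumKey {
  // Hypothetical helper: derive a partition index from the enum constant's
  // name rather than from its identity-based hashCode().
  def stablePartition[E <: java.lang.Enum[E]](key: E, numPartitions: Int): Int = {
    // String.hashCode is defined by the Java API, so it is the same on every JVM.
    val h = key.name().hashCode
    // floorMod keeps the result in [0, numPartitions) even for negative hashes.
    java.lang.Math.floorMod(h, numPartitions)
  }
}

object StableEnumKeyDemo {
  def main(args: Array[String]): Unit = {
    val p = StableEnumKey.stablePartition(TimeUnit.SECONDS, 8)
    println(s"SECONDS goes to partition $p")
  }
}
```

The same constant, looked up again by name (for example after deserialization on another JVM), maps to the same partition.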
On 03/05/2015 09:49 PM, Mridul Muralidharan wrote:
I have a strong dislike for Java enums because they are not stable
across JVMs: if one undergoes serde, you end up with unpredictable
results at times [1].
That is one of the reasons why we prevent enums from being keys, though
it is highly possible that users depend on them internally and shoot
themselves in the foot.
It would be better to keep away from them in general and use something more stable.
Regards,
Mridul
[1] Having had to debug this issue for 2 weeks - I really really hate it.
On Thu, Mar 5, 2015 at 1:08 PM, Imran Rashid <iras...@cloudera.com> wrote:
I have a very strong dislike for #1 (scala enumerations). I'm ok with #4
(with Xiangrui's final suggestion, especially making it sealed & available
in Java), but I really think #2, java enums, are the best option.
Java enums actually have some very real advantages over the other
approaches -- you get values(), valueOf(), EnumSet, and EnumMap. There has
been endless debate in the Scala community about the problems with the
approaches in Scala. Very smart, level-headed Scala gurus have complained
about their shortcomings (Rex Kerr's name comes to mind, though I'm
not positive about that); there have been numerous well-thought-out
proposals to give Scala a better enum. But the powers-that-be in Scala
always reject them. IIRC the explanation for rejecting is basically that
(a) enums aren't important enough for introducing some new special feature,
scala's got bigger things to work on and (b) if you really need a good
enum, just use java's enum.
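As a rough illustration of the four features listed above, here is a small Scala sketch that calls them on a JDK enum. java.util.concurrent.TimeUnit is only a convenient stand-in (a Spark enum such as SaveMode would behave the same way), and the object name JavaEnumFeatures and the label strings are made up for the example.

```scala
import java.util.{EnumMap, EnumSet}
import java.util.concurrent.TimeUnit

object JavaEnumFeatures {
  // values(): every constant, in declaration order.
  val all: Array[TimeUnit] = TimeUnit.values()

  // valueOf(): look a constant up by its exact name.
  val seconds: TimeUnit = TimeUnit.valueOf("SECONDS")

  // EnumSet: a compact set specialised for enum constants.
  val coarse: EnumSet[TimeUnit] = EnumSet.of(TimeUnit.HOURS, TimeUnit.DAYS)

  // EnumMap: a map keyed by enum constants, backed by an array.
  val labels: EnumMap[TimeUnit, String] =
    new EnumMap[TimeUnit, String](classOf[TimeUnit])
  labels.put(TimeUnit.SECONDS, "s")
  labels.put(TimeUnit.MINUTES, "min")
}
```

None of these come for free with the Scala-side alternatives; they would have to be written by hand for each enum-like type.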
I doubt it really matters that much for Spark internals, which is why I
think #4 is fine. But I figured I'd give my spiel, because every developer
loves language wars :)
Imran
On Thu, Mar 5, 2015 at 1:35 AM, Xiangrui Meng <men...@gmail.com> wrote:
`case object` inside an `object` doesn't show up in Java. This is the
minimal code I found to make everything show up correctly in both
Scala and Java:
sealed abstract class StorageLevel // cannot be a trait

object StorageLevel {
  private[this] case object _MemoryOnly extends StorageLevel
  final val MemoryOnly: StorageLevel = _MemoryOnly

  private[this] case object _DiskOnly extends StorageLevel
  final val DiskOnly: StorageLevel = _DiskOnly
}
On Wed, Mar 4, 2015 at 8:10 PM, Patrick Wendell <pwend...@gmail.com> wrote:
I like #4 as well and agree with Aaron's suggestion.
- Patrick
On Wed, Mar 4, 2015 at 6:07 PM, Aaron Davidson <ilike...@gmail.com> wrote:
I'm cool with #4 as well, but make sure we dictate that the values should
be defined within an object with the same name as the enumeration (like we
do for StorageLevel). Otherwise we may pollute a higher namespace.
e.g. we SHOULD do:
trait StorageLevel

object StorageLevel {
  case object MemoryOnly extends StorageLevel
  case object DiskOnly extends StorageLevel
}
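A hedged sketch of how the companion-object layout above might be used from Scala. Marking the trait `sealed` follows the suggestion made elsewhere in this thread and is an assumption here; the `values` and `fromString` helpers and the StorageLevelDemo object are hypothetical additions that roughly mimic values() and valueOf() from Java enums, and are not part of the proposal.

```scala
// `sealed` is an assumption: it lets the compiler check matches for exhaustiveness.
sealed trait StorageLevel

object StorageLevel {
  case object MemoryOnly extends StorageLevel
  case object DiskOnly extends StorageLevel

  // Hypothetical helper: must be kept in sync with the case objects by hand.
  val values: Seq[StorageLevel] = Seq(MemoryOnly, DiskOnly)

  // Hypothetical helper: a case object's toString is its simple name.
  def fromString(name: String): Option[StorageLevel] =
    values.find(_.toString == name)
}

object StorageLevelDemo {
  // Values are referenced through the companion, so nothing leaks into
  // the enclosing namespace.
  def describe(level: StorageLevel): String = level match {
    case StorageLevel.MemoryOnly => "kept in memory"
    case StorageLevel.DiskOnly   => "kept on disk"
  }
}
```

Because the values live inside `object StorageLevel`, call sites read as `StorageLevel.MemoryOnly`, which is the namespace hygiene argued for above.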
On Wed, Mar 4, 2015 at 5:37 PM, Michael Armbrust <mich...@databricks.com> wrote:
#4 with a preference for CamelCaseEnums
On Wed, Mar 4, 2015 at 5:29 PM, Joseph Bradley <jos...@databricks.com> wrote:
another vote for #4
People are already used to adding "()" in Java.
On Wed, Mar 4, 2015 at 5:14 PM, Stephen Boesch <java...@gmail.com> wrote:
#4 but with MemoryOnly (more scala-like)
http://docs.scala-lang.org/style/naming-conventions.html
Constants, Values, Variable and Methods
Constant names should be in upper camel case. That is, if the member is
final, immutable and it belongs to a package object or an object, it may
be considered a constant (similar to Java's static final members):
object Container {
  val MyConstant = ...
}
2015-03-04 17:11 GMT-08:00 Xiangrui Meng <men...@gmail.com>:
Hi all,
There are many places where we use enum-like types in Spark, but in
different ways. Every approach has both pros and cons. I wonder
whether there should be an "official" approach for enum-like types in
Spark.
1. Scala's Enumeration (e.g., SchedulingMode, WorkerState, etc)
* All types show up as Enumeration.Value in Java.
http://spark.apache.org/docs/latest/api/java/org/apache/spark/scheduler/SchedulingMode.html
2. Java's Enum (e.g., SaveMode, IOMode)
* Implementation must be in a Java file.
* Values don't show up in the ScalaDoc:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.network.util.IOMode
3. Static fields in Java (e.g., TripletFields)
* Implementation must be in a Java file.
* Doesn't need "()" in Java code.
* Values don't show up in the ScalaDoc:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.graphx.TripletFields
4. Objects in Scala. (e.g., StorageLevel)
* Needs "()" in Java code.
* Values show up in both ScalaDoc and JavaDoc:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.storage.StorageLevel$
http://spark.apache.org/docs/latest/api/java/org/apache/spark/storage/StorageLevel.html
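For comparison, approach 1 in the list above can be sketched as follows. This is an illustrative reconstruction of the usual scala.Enumeration pattern, not a copy of Spark's source; the object name EnumerationDemo is hypothetical. It shows why Java callers only see Enumeration.Value: every constant has the same runtime class, and the `SchedulingMode` type is merely a Scala-side alias.

```scala
object SchedulingMode extends Enumeration {
  // Scala-only type alias; it does not exist as a class for Java callers.
  type SchedulingMode = Value
  val FAIR, FIFO, NONE = Value
}

object EnumerationDemo {
  import SchedulingMode._

  // Lookup by name and iteration come with Enumeration itself.
  val parsed: SchedulingMode = SchedulingMode.withName("FIFO")
  val count: Int = SchedulingMode.values.size

  // True for any constant of any Enumeration: they all share one runtime type.
  val isPlainValue: Boolean = parsed.isInstanceOf[Enumeration#Value]
}
```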
It would be great if we had an "official" approach for this as well
as a naming convention for enum-like values ("MEMORY_ONLY" or
"MemoryOnly"). Personally, I like 4) with "MEMORY_ONLY". Any
thoughts?
Best,
Xiangrui
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org