[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5574


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-28 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-97174487
  
LGTM I have merged this into master thanks @vanzin @ilganeli @srowen


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-28 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29277652
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -432,6 +505,21 @@ private[spark] object SparkConf extends Logging {
   AlternateConfig("spark.yarn.applicationMaster.waitTries", "1.3",
 // Translate old value to a duration, with 10s wait time per try.
 translation = s => s"${s.toLong * 10}s")),
+"spark.reducer.maxSizeInFlight" -> Seq(
+  AlternateConfig("spark.reducer.maxMbInFlight", "1.4")),
+"spark.kryoserializer.buffer" ->
+Seq(AlternateConfig("spark.kryoserializer.buffer.mb", "1.4", 
+  translation = s => s"${s.toDouble * 1000}k")),
--- End diff --

very small nit, but I would put the `Seq(` on L510 to be consistent with 
the rest.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-28 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-97172697
  
Hi @ilganeli sorry for slipping on this. I've been busy with my own patches 
and I will look at this right now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-28 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-97111691
  
@andrewor14 How does this look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-96767662
  
  [Test build #31003 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31003/consoleFull)
 for   PR 5574 at commit 
[`11f6999`](https://github.com/apache/spark/commit/11f699948b70ed6dadb4653e2c55013fd60a074d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-27 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-96764362
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-96087350
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-96008166
  
  [Test build #30946 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30946/consoleFull)
 for   PR 5574 at commit 
[`11f6999`](https://github.com/apache/spark/commit/11f699948b70ed6dadb4653e2c55013fd60a074d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-96002601
  
Latest version looks good, thanks for addressing all of the feedback. Just 
a few nits remaining.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29067603
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+return u.convertTo(d, this);
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+if (multiplier > u.multiplier) {
+  long ratio = multiplier / u.multiplier;
+  if (Long.MAX_VALUE / ratio < d) {
+throw new IllegalArgumentException("Conversion of" + d + "exceeds 
Long.MAX_VALUE in "
+  + name() + ". Try a larger suffix (e.g. MiB instead of KiB)");
+  }
+  return d * ratio;
+} else {
+  // Perform operations in this order to avoid potential overflow 
+  // when computing d * multiplier
+  return d / (u.multiplier / multiplier);
+}
+  }
+
+  public double toBytes(long d) {
+if (d < 0) {
+  throw new IllegalArgumentException("Negative size value. Size must 
be positive: " + d);
+}
+return d * multiplier; 
+  }
+  
+  public long toKiB(long d) { return convertTo(d, KiB); }
+  public long toMiB(long d) { return convertTo(d, MiB); }
+  public long toGiB(long d) { return convertTo(d, GiB); }
+  public long toTiB(long d) { return convertTo(d, TiB); }
+  public long toPiB(long d) { return convertTo(d, PiB); }
+  
+  private final long multiplier;
--- End diff --

nit: fields are generally declared before the constructor.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29067540
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+return u.convertTo(d, this);
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+if (multiplier > u.multiplier) {
+  long ratio = multiplier / u.multiplier;
+  if (Long.MAX_VALUE / ratio < d) {
+throw new IllegalArgumentException("Conversion of" + d + "exceeds 
Long.MAX_VALUE in "
--- End diff --

nit: you need spaces in the string around the `d` value. You could also use 
`String.format()` although that would probably not make things much cleaner.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29067576
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+return u.convertTo(d, this);
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+if (multiplier > u.multiplier) {
+  long ratio = multiplier / u.multiplier;
+  if (Long.MAX_VALUE / ratio < d) {
+throw new IllegalArgumentException("Conversion of" + d + "exceeds 
Long.MAX_VALUE in "
+  + name() + ". Try a larger suffix (e.g. MiB instead of KiB)");
--- End diff --

nit: s/suffix/unit?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29067422
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -407,7 +474,13 @@ private[spark] object SparkConf extends Logging {
 "The spark.cache.class property is no longer being used! Specify 
storage levels using " +
 "the RDD.persist() method instead."),
   DeprecatedConfig("spark.yarn.user.classpath.first", "1.3",
-"Please use spark.{driver,executor}.userClassPathFirst instead."))
+"Please use spark.{driver,executor}.userClassPathFirst instead."),
+  DeprecatedConfig("spark.kryoserializer.buffer.mb", "1.4",
--- End diff --

I wanted to still provide this message to clarify that fractional values 
are no longer allowed.
Thank you,
Ilya Ganelin


From: Marcelo Vanzin 
mailto:notificati...@github.com>>
Reply-To: apache/spark 
mailto:re...@reply.github.com>>
Date: Friday, April 24, 2015 at 10:16 AM
To: apache/spark mailto:sp...@noreply.github.com>>
Cc: "Ganelin, Ilya" 
mailto:ilya.gane...@capitalone.com>>
Subject: Re: [spark] [SPARK-5932][CORE] Use consistent naming for size 
properties (#5574)


In 
core/src/main/scala/org/apache/spark/SparkConf.scala:

> @@ -407,7 +474,13 @@ private[spark] object SparkConf extends Logging {
>  "The spark.cache.class property is no longer being used! Specify 
storage levels using " +
>  "the RDD.persist() method instead."),
>DeprecatedConfig("spark.yarn.user.classpath.first", "1.3",
> -"Please use spark.{driver,executor}.userClassPathFirst 
instead."))
> +"Please use spark.{driver,executor}.userClassPathFirst 
instead."),
> +  DeprecatedConfig("spark.kryoserializer.buffer.mb", "1.4",


Since you added the alternate config, you can remove this entry.

—
Reply to this email directly or view it on 
GitHub.


The information contained in this e-mail is confidential and/or proprietary 
to Capital One and/or its affiliates. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed.  If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29067406
  
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -82,6 +86,107 @@ class UtilsSuite extends FunSuite with 
ResetSystemProperties {
 }
   }
 
+  test("Test byteString conversion") {
+// Test zero
+assert(Utils.byteStringAsBytes("0") === 0)
+
+assert(Utils.byteStringAsGb("1") === 1)
+assert(Utils.byteStringAsGb("1g") === 1)
+assert(Utils.byteStringAsGb("1023m") === 0)
+assert(Utils.byteStringAsGb("1024m") === 1)
+assert(Utils.byteStringAsGb("1048575k") === 0)
+assert(Utils.byteStringAsGb("1048576k") === 1)
+assert(Utils.byteStringAsGb("1k") === 0)
+assert(Utils.byteStringAsGb("1t") === ByteUnit.TiB.toGiB(1))
+assert(Utils.byteStringAsGb("1p") === ByteUnit.PiB.toGiB(1))
+
+assert(Utils.byteStringAsMb("1") === 1)
+assert(Utils.byteStringAsMb("1m") === 1)
+assert(Utils.byteStringAsMb("1048575b") === 0)
+assert(Utils.byteStringAsMb("1048576b") === 1)
+assert(Utils.byteStringAsMb("1023k") === 0)
+assert(Utils.byteStringAsMb("1024k") === 1)
+assert(Utils.byteStringAsMb("3645k") === 3)
+assert(Utils.byteStringAsMb("1024gb") === 1048576)
+assert(Utils.byteStringAsMb("1g") === ByteUnit.GiB.toMiB(1))
+assert(Utils.byteStringAsMb("1t") === ByteUnit.TiB.toMiB(1))
+assert(Utils.byteStringAsMb("1p") === ByteUnit.PiB.toMiB(1))
+
+assert(Utils.byteStringAsKb("1") === 1)
+assert(Utils.byteStringAsKb("1k") === 1)
+assert(Utils.byteStringAsKb("1m") === ByteUnit.MiB.toKiB(1))
+assert(Utils.byteStringAsKb("1g") === ByteUnit.GiB.toKiB(1))
+assert(Utils.byteStringAsKb("1t") === ByteUnit.TiB.toKiB(1))
+assert(Utils.byteStringAsKb("1p") === ByteUnit.PiB.toKiB(1))
+
+assert(Utils.byteStringAsBytes("1") === 1)
+assert(Utils.byteStringAsBytes("1k") === ByteUnit.KiB.toBytes(1))
+assert(Utils.byteStringAsBytes("1m") === ByteUnit.MiB.toBytes(1))
+assert(Utils.byteStringAsBytes("1g") === ByteUnit.GiB.toBytes(1))
+assert(Utils.byteStringAsBytes("1t") === ByteUnit.TiB.toBytes(1))
+assert(Utils.byteStringAsBytes("1p") === ByteUnit.PiB.toBytes(1))
+
+// Overflow handling, 1073741824p exceeds Long.MAX_VALUE if converted 
straight to Bytes
+// This demonstrates that we can have e.g 1024^3 PB without 
overflowing. 
+assert(Utils.byteStringAsGb("1073741824p") === 
ByteUnit.PiB.toGiB(1073741824))
+assert(Utils.byteStringAsMb("1073741824p") === 
ByteUnit.PiB.toMiB(1073741824))
+
+// Run this to confirm it doesn't throw an exception
+assert(Utils.byteStringAsBytes("9223372036854775807") === 
9223372036854775807L) 
+assert(ByteUnit.PiB.toPiB(9223372036854775807L) === 
9223372036854775807L)
+
+// Test overflow exception
+intercept[IllegalArgumentException] {
+  // This value exceeds Long.MAX when converted to bytes 
+  Utils.byteStringAsBytes("9223372036854775808")
+}
+
+// Test overflow exception
+intercept[IllegalArgumentException] {
+  // This value exceeds Long.MAX when converted to TB
+  ByteUnit.PiB.toTiB(9223372036854775807L)
+}
+
+// Test fractional string
+intercept[NumberFormatException] {
+  Utils.byteStringAsMb("0.064")
+}
+
+// Test fractional string
+intercept[NumberFormatException] {
+  Utils.byteStringAsMb("0.064m")
+}
+
+// Test invalid strings
+intercept[NumberFormatException] {
+  Utils.byteStringAsBytes("500ub")
+}
+
+// Test invalid strings
+intercept[NumberFormatException] {
+  Utils.byteStringAsBytes("This breaks 600b")
+}
+
+intercept[NumberFormatException] {
+  Utils.byteStringAsBytes("This breaks 600")
+}
+
+intercept[NumberFormatException] {
+  Utils.byteStringAsBytes("600gb This breaks")
+}
+
+intercept[NumberFormatException] {
+  Utils.byteStringAsBytes("This 123mb breaks")
+}
+
+//// Test overflow
--- End diff --

Delete these lines?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@sp

[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29067209
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1041,21 +1041,48 @@ private[spark] object Utils extends Logging {
   }
 
   /**
-   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of megabytes.
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to bytes for 
internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in bytes.
+   */
+  def byteStringAsBytes(str: String): Long = {
+JavaUtils.byteStringAsBytes(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to kibibytes 
for internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
kibibytes.
+   */
+  def byteStringAsKb(str: String): Long = {
+JavaUtils.byteStringAsKb(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to mebibytes 
for internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
mebibytes.
+   */
+  def byteStringAsMb(str: String): Long = {
+JavaUtils.byteStringAsMb(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m, 500g) to 
gibibytes for internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
gibibytes.
+   */
+  def byteStringAsGb(str: String): Long = {
+JavaUtils.byteStringAsGb(str)
+  }
+
+  /**
+   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of mebibytes.
*/
   def memoryStringToMb(str: String): Int = {
-val lower = str.toLowerCase
-if (lower.endsWith("k")) {
-  (lower.substring(0, lower.length-1).toLong / 1024).toInt
-} else if (lower.endsWith("m")) {
-  lower.substring(0, lower.length-1).toInt
-} else if (lower.endsWith("g")) {
-  lower.substring(0, lower.length-1).toInt * 1024
-} else if (lower.endsWith("t")) {
-  lower.substring(0, lower.length-1).toInt * 1024 * 1024
-} else {// no suffix, so it's just a number in bytes
-  (lower.toLong / 1024 / 1024).toInt
-}
+// Convert to bytes, rather than directly to MB, because when no units 
are specified the unit
+// is assumed to be bytes
+(JavaUtils.byteStringAsBytes(str) / 1024 / 1024).toInt
--- End diff --

super nit: you could drop the `JavaUtils.` prefix in the call.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29066890
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -407,7 +474,13 @@ private[spark] object SparkConf extends Logging {
 "The spark.cache.class property is no longer being used! Specify 
storage levels using " +
 "the RDD.persist() method instead."),
   DeprecatedConfig("spark.yarn.user.classpath.first", "1.3",
-"Please use spark.{driver,executor}.userClassPathFirst instead."))
+"Please use spark.{driver,executor}.userClassPathFirst instead."),
+  DeprecatedConfig("spark.kryoserializer.buffer.mb", "1.4",
--- End diff --

Since you added the alternate config, you can remove this entry.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-24 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95976726
  
@andrewor14 what do you think about this in its present state? I think it 
should be getting near done. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95777565
  
  [Test build #30889 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30889/consoleFull)
 for   PR 5574 at commit 
[`49a8720`](https://github.com/apache/spark/commit/49a8720c32ac191248e23428092192c5bf9faed9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95777572
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30889/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95764596
  
  [Test build #30889 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30889/consoleFull)
 for   PR 5574 at commit 
[`49a8720`](https://github.com/apache/spark/commit/49a8720c32ac191248e23428092192c5bf9faed9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95763818
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30888/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95763816
  
  [Test build #30888 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30888/consoleFull)
 for   PR 5574 at commit 
[`2ab886b`](https://github.com/apache/spark/commit/2ab886b034df177eb19fbcccd05b675c76d0369a).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95763690
  
  [Test build #30888 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30888/consoleFull)
 for   PR 5574 at commit 
[`2ab886b`](https://github.com/apache/spark/commit/2ab886b034df177eb19fbcccd05b675c76d0369a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95762985
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30887/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95762983
  
  [Test build #30887 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30887/consoleFull)
 for   PR 5574 at commit 
[`fc85733`](https://github.com/apache/spark/commit/fc85733a9489ba008df8aadc61971136d58b79be).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95762512
  
  [Test build #30887 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30887/consoleFull)
 for   PR 5574 at commit 
[`fc85733`](https://github.com/apache/spark/commit/fc85733a9489ba008df8aadc61971136d58b79be).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29013679
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1, "Bytes"),
+  KiB (1024, "KiB"),
+  MiB (Math.pow(1024, 2), "MiB"),
+  GiB (Math.pow(1024, 3), "GiB"),
+  TiB (Math.pow(1024, 4), "TiB"),
+  PiB (Math.pow(1024, 5), "PiB");
+
+  private ByteUnit(double multiplier, String name) {
+this.multiplier = multiplier;
+this.name = name;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+double converted = u.toBytes(d) / multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+return (long) converted;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+double converted = toBytes(d) / u.multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + u.name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+
+return (long) converted;
+  }
+
+  public double toBytes(long d) {
--- End diff --

Just to illustrate my concern, these are the kinds of things you run into 
when working with floating-point numbers:

scala> (1E15).asInstanceOf[Long] == (1E15+1).asInstanceOf[Long]
res0: Boolean = false

scala> (1E16).asInstanceOf[Long] == (1E16+1).asInstanceOf[Long]
res1: Boolean = true



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29013473
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1, "Bytes"),
+  KiB (1024, "KiB"),
+  MiB (Math.pow(1024, 2), "MiB"),
+  GiB (Math.pow(1024, 3), "GiB"),
+  TiB (Math.pow(1024, 4), "TiB"),
+  PiB (Math.pow(1024, 5), "PiB");
+
+  private ByteUnit(double multiplier, String name) {
+this.multiplier = multiplier;
+this.name = name;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+double converted = u.toBytes(d) / multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+return (long) converted;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+double converted = toBytes(d) / u.multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + u.name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+
+return (long) converted;
+  }
+
+  public double toBytes(long d) {
+if (d < 0) {
+  throw new IllegalArgumentException("Negative size value. Size must 
be positive: " + d);
+}
+return d * multiplier; 
+  }
+  
+  public long toKiB(long d) { return convertTo(d, KiB); }
+  public long toMiB(long d) { return convertTo(d, MiB); }
+  public long toGiB(long d) { return convertTo(d, GiB); }
+  public long toTiB(long d) { return convertTo(d, TiB); }
+  public long toPiB(long d) { return convertTo(d, PiB); }
+  
+  private double multiplier = 0;
+  private String name = "";
--- End diff --

In fact you don't need `name` at all. Just call `name()`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29013124
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1, "Bytes"),
+  KiB (1024, "KiB"),
+  MiB (Math.pow(1024, 2), "MiB"),
+  GiB (Math.pow(1024, 3), "GiB"),
+  TiB (Math.pow(1024, 4), "TiB"),
+  PiB (Math.pow(1024, 5), "PiB");
+
+  private ByteUnit(double multiplier, String name) {
+this.multiplier = multiplier;
+this.name = name;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+double converted = u.toBytes(d) / multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+return (long) converted;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+double converted = toBytes(d) / u.multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + u.name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+
+return (long) converted;
+  }
+
+  public double toBytes(long d) {
+if (d < 0) {
+  throw new IllegalArgumentException("Negative size value. Size must 
be positive: " + d);
+}
+return d * multiplier; 
+  }
+  
+  public long toKiB(long d) { return convertTo(d, KiB); }
+  public long toMiB(long d) { return convertTo(d, MiB); }
+  public long toGiB(long d) { return convertTo(d, GiB); }
+  public long toTiB(long d) { return convertTo(d, TiB); }
+  public long toPiB(long d) { return convertTo(d, PiB); }
+  
+  private double multiplier = 0;
+  private String name = "";
--- End diff --

These two fields need to be final, as I mentioned before. Just don't 
initialize them here.

`multiplier`, also, should be a `long`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r29013092
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1, "Bytes"),
+  KiB (1024, "KiB"),
+  MiB (Math.pow(1024, 2), "MiB"),
+  GiB (Math.pow(1024, 3), "GiB"),
+  TiB (Math.pow(1024, 4), "TiB"),
+  PiB (Math.pow(1024, 5), "PiB");
+
+  private ByteUnit(double multiplier, String name) {
+this.multiplier = multiplier;
+this.name = name;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long convertFrom(long d, ByteUnit u) {
+double converted = u.toBytes(d) / multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+return (long) converted;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convertTo(long d, ByteUnit u) {
+double converted = toBytes(d) / u.multiplier;
+if (converted > Long.MAX_VALUE)
+  throw new IllegalArgumentException("Converted value (" + converted + 
") " +
+"exceeds Long.MAX_VALUE for " + u.name + ". Try a larger suffix 
(e.g. MiB instead of KiB)");
+
+return (long) converted;
+  }
+
+  public double toBytes(long d) {
--- End diff --

As I mentioned, I do not think that using double is the correct thing. Not 
only the API looks weird (`toBytes` returns double while the others return 
`long`), but doubles do not have infinite precision.

Instead, for example, have this:

public long convertFrom(long d, ByteUnit u) {
  return u.convertTo(d, this);
}

public long convertTo(long d, ByteUnit u) {
  if (multiplier > u.multiplier) {
long ratio = multiplier / u.multiplier;
if (Long.MAX_VALUE / ratio < d) {
  throw new IllegalArgumentException("OVERFLOW!");
}
return d * ratio;
  } else {
return d / (u.multiplier / multiplier);
  }
}



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95730028
  
  [Test build #30878 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30878/consoleFull)
 for   PR 5574 at commit 
[`852a407`](https://github.com/apache/spark/commit/852a407b469a12345a3253c62871e186b2ddb06c).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95730037
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30878/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95729606
  
  [Test build #30878 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30878/consoleFull)
 for   PR 5574 at commit 
[`852a407`](https://github.com/apache/spark/commit/852a407b469a12345a3253c62871e186b2ddb06c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95728902
  
@vanzin Thanks for the continued feedback. I've revamped how I handle 
overflow and added test cases for various boundary conditions. I think this 
should address the prior concerns. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28990766
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
--- End diff --

I saw your comment about using double - I don't think that's a great idea 
because doubles lose precision as you try to work with values at different 
orders of magniture.

Regarding the last paragraph of my comment above, I don't think it's going 
to be an issue in practice; but the code here can be changed to at least avoid 
overflows where possible. I checked `j.u.c.TimeUnit`, used in the time 
functions in this class, and it seems to follow the approach you took, than 
when an overflow is inevitable it caps the value at `Long.MAX_VALUE`. So that 
part is fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-23 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28989856
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
--- End diff --

Marcelo - I think this could be readily solved if ```toBytes``` returns a 
double. Max for a double is 1.79769e+308 which is more than we conceivably ever 
need and would solve the overflow issue (we'd just need to check that the 
resulting number if less than Long.MAX_VALUE and throw an exception if it's 
not). 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28929956
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
--- End diff --

If you want to be really correct here, you could avoid overflows by playing 
with the multipliers instead of converting things to bytes first.

I think what's bugging me is that the semantics of all these methods are a 
little weird. It seems like you're trying to cap the maximum amount to be 
represented to `Long.MAX_VALUE` *bytes* (so that having `Long.MAX_VALUE` PB, 
for example, would be wrong since you can't convert that to bytes). I'm not 
sure that's needed, but if you want that, it should be enforced differently 
(and not here). Otherwise, I'd rework these methods to avoid overflows where 
possible, and throw exceptions when they would happen.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28929821
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
--- End diff --

How about `convertTo` and `convertFrom` instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28929776
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) {
+if (d == 0) { return 0; }
+long over = MAX / d;
+if (d >  over) return Long.MAX_VALUE;
+if (d < -over) return Long.MIN_VALUE;
--- End diff --

Negative byte counts sound a little weird, but well. Same comment as above, 
though.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28929761
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) {
+if (d == 0) { return 0; }
+long over = MAX / d;
+if (d >  over) return Long.MAX_VALUE;
--- End diff --

Hmmm... I feel like it would be better to throw an exception instead of 
truncating.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28929734
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) {
+if (d == 0) { return 0; }
+long over = MAX / d;
--- End diff --

Doesn't seem worth it to have a private `MAX` just for this one use. Use 
`Long.MAX_VALUE`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95363048
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30789/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95363036
  
  [Test build #30789 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30789/consoleFull)
 for   PR 5574 at commit 
[`22413b1`](https://github.com/apache/spark/commit/22413b19e9323f5df45cfb6d1758baa709382184).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95345676
  
  [Test build #30789 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30789/consoleFull)
 for   PR 5574 at commit 
[`22413b1`](https://github.com/apache/spark/commit/22413b19e9323f5df45cfb6d1758baa709382184).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28916298
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/JavaUtils.java ---
@@ -186,5 +199,84 @@ public static long timeStringAsMs(String str) {
   public static long timeStringAsSec(String str) {
 return parseTimeString(str, TimeUnit.SECONDS);
   }
+  
+  /**
+   * Convert a passed byte string (e.g. 50b, 100kb, or 250mb) to a 
ByteUnit for
+   * internal use. If no suffix is provided a direct conversion of the 
provided default is 
+   * attempted.
+   */
+  private static long parseByteString(String str, ByteUnit unit) {
+String lower = str.toLowerCase().trim();
+
+try {
+  Matcher m = Pattern.compile("([0-9]+)([a-z]+)?").matcher(lower);
+  Matcher fractionMatcher = 
Pattern.compile("([0-9]*\\.[0-9]*)([a-z]+)?").matcher(lower);
+  
+  if(m.matches()) {
+long val = Long.parseLong(m.group(1));
+String suffix = m.group(2);
+
+// Check for invalid suffixes
+if (suffix != null && !byteSuffixes.containsKey(suffix)) {
+  throw new NumberFormatException("Invalid suffix: \"" + suffix + 
"\"");
+}
+
+// If suffix is valid use that, otherwise none was provided and 
use the default passed
+return unit.interpret(val, suffix != null ? 
byteSuffixes.get(suffix) : unit);  
+  } else if (fractionMatcher.matches()) {
+double val = Double.parseDouble(fractionMatcher.group(1));
+
+throw new NumberFormatException("Fractional values are not 
supported. Input was: " + val);
--- End diff --

In that case, `Double.parseDouble` is redundant, you can just use 
`fractionMatcher.group(1)` to get the match. Also, in the `fractionMatcher` 
regex, it should be `+` instead of `*`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28916137
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -432,6 +505,21 @@ private[spark] object SparkConf extends Logging {
   AlternateConfig("spark.yarn.applicationMaster.waitTries", "1.3",
 // Translate old value to a duration, with 10s wait time per try.
 translation = s => s"${s.toLong * 10}s")),
+"spark.reducer.maxSizeInFlight" -> Seq(
+  AlternateConfig("spark.reducer.maxMbInFlight", "1.4")),
+"spark.kryoserializer.buffer" ->
+Seq(AlternateConfig("spark.kryoserializer.buffer.mb", "1.4", 
+  translation = s => s"${s.toDouble * 1000}k")),
--- End diff --

I think since you're adding the new config, it's fine to not allow the old 
config take the new style. If you really want to support that, you could try to 
parse the config using the new API in an exception handler.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28916065
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/JavaUtils.java ---
@@ -186,5 +199,84 @@ public static long timeStringAsMs(String str) {
   public static long timeStringAsSec(String str) {
 return parseTimeString(str, TimeUnit.SECONDS);
   }
+  
+  /**
+   * Convert a passed byte string (e.g. 50b, 100kb, or 250mb) to a 
ByteUnit for
+   * internal use. If no suffix is provided a direct conversion of the 
provided default is 
+   * attempted.
+   */
+  private static long parseByteString(String str, ByteUnit unit) {
+String lower = str.toLowerCase().trim();
+
+try {
+  Matcher m = Pattern.compile("([0-9]+)([a-z]+)?").matcher(lower);
+  Matcher fractionMatcher = 
Pattern.compile("([0-9]*\\.[0-9]*)([a-z]+)?").matcher(lower);
+  
+  if(m.matches()) {
+long val = Long.parseLong(m.group(1));
+String suffix = m.group(2);
+
+// Check for invalid suffixes
+if (suffix != null && !byteSuffixes.containsKey(suffix)) {
+  throw new NumberFormatException("Invalid suffix: \"" + suffix + 
"\"");
+}
+
+// If suffix is valid use that, otherwise none was provided and 
use the default passed
+return unit.interpret(val, suffix != null ? 
byteSuffixes.get(suffix) : unit);  
+  } else if (fractionMatcher.matches()) {
+double val = Double.parseDouble(fractionMatcher.group(1));
+
+throw new NumberFormatException("Fractional values are not 
supported. Input was: " + val);
--- End diff --

I wanted to have a separate error message. Otherwise, it appears that the 
error has to do with parsing the suffix, not the numerical value. This seemed 
like the cleanest way to check for a fractional value. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28915996
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) {
+if (d == 0) { return 0; }
+long over = MAX / d;
+if (d >  over) return Long.MAX_VALUE;
+if (d < -over) return Long.MIN_VALUE;
+return d * multiplier; 
+  }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  private long multiplier = 0;
+  static final long MAX = Long.MAX_VALUE;
--- End diff --

If you want this to be available, then it should be public. But I'd avoid 
adding it until it's actually needed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28915845
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
+  MiB ((long) Math.pow(1024l, 2l)),
+  GiB ((long) Math.pow(1024l, 3l)),
+  TiB ((long) Math.pow(1024l, 4l)),
+  PiB ((long) Math.pow(1024l, 5l));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) { return x(d, multiplier); }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  long multiplier = 0;
--- End diff --

That's ok. This is Java, not Scala.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28914579
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/JavaUtils.java ---
@@ -186,5 +199,84 @@ public static long timeStringAsMs(String str) {
   public static long timeStringAsSec(String str) {
 return parseTimeString(str, TimeUnit.SECONDS);
   }
+  
+  /**
+   * Convert a passed byte string (e.g. 50b, 100kb, or 250mb) to a 
ByteUnit for
+   * internal use. If no suffix is provided a direct conversion of the 
provided default is 
+   * attempted.
+   */
+  private static long parseByteString(String str, ByteUnit unit) {
+String lower = str.toLowerCase().trim();
+
+try {
+  Matcher m = Pattern.compile("([0-9]+)([a-z]+)?").matcher(lower);
+  Matcher fractionMatcher = 
Pattern.compile("([0-9]*\\.[0-9]*)([a-z]+)?").matcher(lower);
+  
+  if(m.matches()) {
+long val = Long.parseLong(m.group(1));
+String suffix = m.group(2);
+
+// Check for invalid suffixes
+if (suffix != null && !byteSuffixes.containsKey(suffix)) {
+  throw new NumberFormatException("Invalid suffix: \"" + suffix + 
"\"");
+}
+
+// If suffix is valid use that, otherwise none was provided and 
use the default passed
+return unit.interpret(val, suffix != null ? 
byteSuffixes.get(suffix) : unit);  
+  } else if (fractionMatcher.matches()) {
+double val = Double.parseDouble(fractionMatcher.group(1));
+
+throw new NumberFormatException("Fractional values are not 
supported. Input was: " + val);
--- End diff --

hmmm... why bother with the match then?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95330076
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30783/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95330051
  
  [Test build #30783 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30783/consoleFull)
 for   PR 5574 at commit 
[`3dfae96`](https://github.com/apache/spark/commit/3dfae96ee8f594e4136c6916326ef7e0fe70b4be).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95325867
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30774/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95325846
  
  [Test build #30774 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30774/consoleFull)
 for   PR 5574 at commit 
[`e428049`](https://github.com/apache/spark/commit/e428049e516a6b4c8bb13074d44098d8d7ded535).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95308841
  
  [Test build #30783 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30783/consoleFull)
 for   PR 5574 at commit 
[`3dfae96`](https://github.com/apache/spark/commit/3dfae96ee8f594e4136c6916326ef7e0fe70b4be).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28905078
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -432,6 +505,21 @@ private[spark] object SparkConf extends Logging {
   AlternateConfig("spark.yarn.applicationMaster.waitTries", "1.3",
 // Translate old value to a duration, with 10s wait time per try.
 translation = s => s"${s.toLong * 10}s")),
+"spark.reducer.maxSizeInFlight" -> Seq(
+  AlternateConfig("spark.reducer.maxMbInFlight", "1.4")),
+"spark.kryoserializer.buffer" ->
+Seq(AlternateConfig("spark.kryoserializer.buffer.mb", "1.4", 
+  translation = s => s"${s.toDouble * 1000}k")),
--- End diff --

This automatic translation may throw a NumberFormatException if someone 
tries to use the .mb parameter as "64k" (e.g. the correct new format). Is that 
a case we should be concerned with? There will be enough warnings and errors 
thrown for them to readily track down the problem and fix the erroneous config 
so this should be ok but want to confirm that. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28904736
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
+  MiB ((long) Math.pow(1024l, 2l)),
+  GiB ((long) Math.pow(1024l, 3l)),
+  TiB ((long) Math.pow(1024l, 4l)),
+  PiB ((long) Math.pow(1024l, 5l));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) { return x(d, multiplier); }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  long multiplier = 0;
--- End diff --

Multiplier is not final - it's set in the constructor.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28904199
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -407,7 +474,13 @@ private[spark] object SparkConf extends Logging {
 "The spark.cache.class property is no longer being used! Specify 
storage levels using " +
 "the RDD.persist() method instead."),
   DeprecatedConfig("spark.yarn.user.classpath.first", "1.3",
-"Please use spark.{driver,executor}.userClassPathFirst instead."))
+"Please use spark.{driver,executor}.userClassPathFirst instead."),
+  DeprecatedConfig("spark.kryoserializer.buffer.mb", "1.4",
--- End diff --

Automatic translation would fail with a NumberFormatException though if 
they try to pass in a correctly formatted new value for the old config (e.g. 
64k) and there's nowhere to catch that exception in this code.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28901683
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/JavaUtils.java ---
@@ -186,5 +199,84 @@ public static long timeStringAsMs(String str) {
   public static long timeStringAsSec(String str) {
 return parseTimeString(str, TimeUnit.SECONDS);
   }
+  
+  /**
+   * Convert a passed byte string (e.g. 50b, 100kb, or 250mb) to a 
ByteUnit for
+   * internal use. If no suffix is provided a direct conversion of the 
provided default is 
+   * attempted.
+   */
+  private static long parseByteString(String str, ByteUnit unit) {
+String lower = str.toLowerCase().trim();
+
+try {
+  Matcher m = Pattern.compile("([0-9]+)([a-z]+)?").matcher(lower);
+  Matcher fractionMatcher = 
Pattern.compile("([0-9]*\\.[0-9]*)([a-z]+)?").matcher(lower);
+  
+  if(m.matches()) {
--- End diff --

nit: `if (`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28901624
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
+  MiB ((long) Math.pow(1024l, 2l)),
+  GiB ((long) Math.pow(1024l, 3l)),
+  TiB ((long) Math.pow(1024l, 4l)),
+  PiB ((long) Math.pow(1024l, 5l));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) { return x(d, multiplier); }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  long multiplier = 0;
+  static final long MAX = Long.MAX_VALUE;
--- End diff --

Wanted to still keep these in case they were used down the line. Would you 
recommend getting rid of them?
Thank you,
Ilya Ganelin


From: Marcelo Vanzin 
mailto:notificati...@github.com>>
Reply-To: apache/spark 
mailto:re...@reply.github.com>>
Date: Wednesday, April 22, 2015 at 11:35 AM
To: apache/spark mailto:sp...@noreply.github.com>>
Cc: "Ganelin, Ilya" 
mailto:ilya.gane...@capitalone.com>>
Subject: Re: [spark] [SPARK-5932][CORE] Use consistent naming for size 
properties (#5574)


In 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java:

> +  }
> +
> +  // Convert the provided number (d) interpreted as this unit type to 
unit type (u).
> +  public long convert(long d, ByteUnit u) {
> +return toBytes(d) / u.multiplier;
> +  }
> +
> +  public long toBytes(long d) { return x(d, multiplier); }
> +  public long toKiB(long d) { return convert(d, KiB); }
> +  public long toMiB(long d) { return convert(d, MiB); }
> +  public long toGiB(long d) { return convert(d, GiB); }
> +  public long toTiB(long d) { return convert(d, TiB); }
> +  public long toPiB(long d) { return convert(d, PiB); }
> +
> +  long multiplier = 0;
> +  static final long MAX = Long.MAX_VALUE;


not used anywhere?

—
Reply to this email directly or view it on 
GitHub.


The information contained in this e-mail is confidential and/or proprietary 
to Capital One and/or its affiliates. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed.  If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA t

[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28901555
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
+  MiB ((long) Math.pow(1024l, 2l)),
+  GiB ((long) Math.pow(1024l, 3l)),
+  TiB ((long) Math.pow(1024l, 4l)),
+  PiB ((long) Math.pow(1024l, 5l));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) { return x(d, multiplier); }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  long multiplier = 0;
+  static final long MAX = Long.MAX_VALUE;
+
+  /**
+   * Scale d by m, checking for overflow.
+   * This has a short name to make above code more readable.
--- End diff --

There's only one call to this method.

- Is it needed?
- If so, the cryptic name is not helping with anything.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28901488
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
+  MiB ((long) Math.pow(1024l, 2l)),
+  GiB ((long) Math.pow(1024l, 3l)),
+  TiB ((long) Math.pow(1024l, 4l)),
+  PiB ((long) Math.pow(1024l, 5l));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) { return x(d, multiplier); }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  long multiplier = 0;
+  static final long MAX = Long.MAX_VALUE;
--- End diff --

not used anywhere?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28901411
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
+  MiB ((long) Math.pow(1024l, 2l)),
+  GiB ((long) Math.pow(1024l, 3l)),
+  TiB ((long) Math.pow(1024l, 4l)),
+  PiB ((long) Math.pow(1024l, 5l));
+
+  private ByteUnit(long multiplier) {
+this.multiplier = multiplier;
+  }
+
+  // Interpret the provided number (d) with suffix (u) as this unit type.
+  // E.g. KiB.interpret(1, MiB) interprets 1MiB as its KiB representation 
= 1024k
+  public long interpret(long d, ByteUnit u) {
+return u.toBytes(d) / multiplier;  
+  }
+  
+  // Convert the provided number (d) interpreted as this unit type to unit 
type (u). 
+  public long convert(long d, ByteUnit u) {
+return toBytes(d) / u.multiplier;
+  }
+
+  public long toBytes(long d) { return x(d, multiplier); }
+  public long toKiB(long d) { return convert(d, KiB); }
+  public long toMiB(long d) { return convert(d, MiB); }
+  public long toGiB(long d) { return convert(d, GiB); }
+  public long toTiB(long d) { return convert(d, TiB); }
+  public long toPiB(long d) { return convert(d, PiB); }
+  
+  long multiplier = 0;
--- End diff --

private final


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28901391
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  BYTE (1),
+  KiB (1024l),
--- End diff --

super nit: `1234L` seems to be way more common in the code base.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28900776
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1041,21 +1041,48 @@ private[spark] object Utils extends Logging {
   }
 
   /**
-   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of megabytes.
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to bytes for 
internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in bytes.
+   */
+  def byteStringAsBytes(str: String): Long = {
+JavaUtils.byteStringAsBytes(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to kibibytes 
for internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
kibibytes.
+   */
+  def byteStringAsKb(str: String): Long = {
+JavaUtils.byteStringAsKb(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to mebibytes 
for internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
mebibytes.
+   */
+  def byteStringAsMb(str: String): Long = {
+JavaUtils.byteStringAsMb(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m, 500g) to 
gibibytes for internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
gibibytes.
+   */
+  def byteStringAsGb(str: String): Long = {
+JavaUtils.byteStringAsGb(str)
+  }
+
+  /**
+   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of mebibytes.
*/
   def memoryStringToMb(str: String): Int = {
-val lower = str.toLowerCase
-if (lower.endsWith("k")) {
-  (lower.substring(0, lower.length-1).toLong / 1024).toInt
-} else if (lower.endsWith("m")) {
-  lower.substring(0, lower.length-1).toInt
-} else if (lower.endsWith("g")) {
-  lower.substring(0, lower.length-1).toInt * 1024
-} else if (lower.endsWith("t")) {
-  lower.substring(0, lower.length-1).toInt * 1024 * 1024
-} else {// no suffix, so it's just a number in bytes
-  (lower.toLong / 1024 / 1024).toInt
-}
+// Convert to bytes, rather than directly to MB, because when no units 
are specified the unit
+// is assumed to be bytes
+(JavaUtils.byteStringAsBytes(str) / 1024.0d / 1024.0d).toInt
--- End diff --

Why use floating point math here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28900589
  
--- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala ---
@@ -407,7 +474,13 @@ private[spark] object SparkConf extends Logging {
 "The spark.cache.class property is no longer being used! Specify 
storage levels using " +
 "the RDD.persist() method instead."),
   DeprecatedConfig("spark.yarn.user.classpath.first", "1.3",
-"Please use spark.{driver,executor}.userClassPathFirst instead."))
+"Please use spark.{driver,executor}.userClassPathFirst instead."),
+  DeprecatedConfig("spark.kryoserializer.buffer.mb", "1.4",
--- End diff --

You could have a translation function for this, no? Like 
`spark.yarn.applicationMaster.waitTries` below.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95289888
  
  [Test build #30774 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30774/consoleFull)
 for   PR 5574 at commit 
[`e428049`](https://github.com/apache/spark/commit/e428049e516a6b4c8bb13074d44098d8d7ded535).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95289324
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95285766
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30769/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95285746
  
  [Test build #30769 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30769/consoleFull)
 for   PR 5574 at commit 
[`e428049`](https://github.com/apache/spark/commit/e428049e516a6b4c8bb13074d44098d8d7ded535).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95262778
  
  [Test build #30769 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30769/consoleFull)
 for   PR 5574 at commit 
[`e428049`](https://github.com/apache/spark/commit/e428049e516a6b4c8bb13074d44098d8d7ded535).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95261583
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95258228
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30762/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95258108
  
  [Test build #30762 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30762/consoleFull)
 for   PR 5574 at commit 
[`e428049`](https://github.com/apache/spark/commit/e428049e516a6b4c8bb13074d44098d8d7ded535).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-95237301
  
  [Test build #30762 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30762/consoleFull)
 for   PR 5574 at commit 
[`e428049`](https://github.com/apache/spark/commit/e428049e516a6b4c8bb13074d44098d8d7ded535).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94979926
  
  [Test build #30706 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30706/consoleFull)
 for   PR 5574 at commit 
[`8b43748`](https://github.com/apache/spark/commit/8b43748145e442c7dabbe9b2d75f49f40f78ab8e).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94979941
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30706/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94962856
  
  [Test build #30706 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30706/consoleFull)
 for   PR 5574 at commit 
[`8b43748`](https://github.com/apache/spark/commit/8b43748145e442c7dabbe9b2d75f49f40f78ab8e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread ilganeli
Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94962704
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94960599
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30705/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94960591
  
  [Test build #30705 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30705/consoleFull)
 for   PR 5574 at commit 
[`8b43748`](https://github.com/apache/spark/commit/8b43748145e442c7dabbe9b2d75f49f40f78ab8e).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94958363
  
  [Test build #30705 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30705/consoleFull)
 for   PR 5574 at commit 
[`8b43748`](https://github.com/apache/spark/commit/8b43748145e442c7dabbe9b2d75f49f40f78ab8e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94956321
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30694/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94956307
  
  [Test build #30694 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30694/consoleFull)
 for   PR 5574 at commit 
[`84a2581`](https://github.com/apache/spark/commit/84a25819ebfc57638f73403a4dc81fde173c9039).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94941428
  
  [Test build #30694 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30694/consoleFull)
 for   PR 5574 at commit 
[`84a2581`](https://github.com/apache/spark/commit/84a25819ebfc57638f73403a4dc81fde173c9039).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread ilganeli
Github user ilganeli commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28821264
  
--- Diff: 
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ---
@@ -49,16 +49,17 @@ class KryoSerializer(conf: SparkConf)
   with Logging
   with Serializable {
 
-  private val bufferSizeMb = 
conf.getDouble("spark.kryoserializer.buffer.mb", 0.064)
-  if (bufferSizeMb >= 2048) {
-throw new IllegalArgumentException("spark.kryoserializer.buffer.mb 
must be less than " +
-  s"2048 mb, got: + $bufferSizeMb mb.")
+  private val bufferSizeKb = 
conf.getSizeAsKb("spark.kryoserializer.buffer", "64k")
--- End diff --

All - I don't know what the right solution here is. The old value simply 
can't work using the new framework. Fractional values are no longer supported 
and near as I can tell this is is the only instance of such usage. The only way 
to truly maintain backwards compatibility (short of throwing an exception) is 
to leave this as conf.getDouble but then this is an exception to the rule for 
how we handle size variables. 

This needs to be ```getSizeAsKb()``` and the value must be specified in the 
right format, otherwise an exception is thrown. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94928923
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30683/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94928910
  
**[Test build #30683 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30683/consoleFull)**
 for PR 5574 at commit 
[`d3d09b6`](https://github.com/apache/spark/commit/d3d09b62db1a6480d14ec11ab3ce3359adf25cf5)
 after a configured wait of `150m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94888456
  
  [Test build #30683 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30683/consoleFull)
 for   PR 5574 at commit 
[`d3d09b6`](https://github.com/apache/spark/commit/d3d09b62db1a6480d14ec11ab3ce3359adf25cf5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94878949
  
**[Test build #30677 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30677/consoleFull)**
 for PR 5574 at commit 
[`fe286b4`](https://github.com/apache/spark/commit/fe286b409ffbb85aad8c0e733d08754d61e129bd)
 after a configured wait of `150m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94878964
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30677/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94829108
  
  [Test build #30677 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30677/consoleFull)
 for   PR 5574 at commit 
[`fe286b4`](https://github.com/apache/spark/commit/fe286b409ffbb85aad8c0e733d08754d61e129bd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94657281
  
**[Test build #30631 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30631/consoleFull)**
 for PR 5574 at commit 
[`c7803cd`](https://github.com/apache/spark/commit/c7803cd8bae1fbc69fbee9bdfc2ca540a8516b9c)
 after a configured wait of `150m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94657310
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30631/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5574#issuecomment-94634315
  
  [Test build #30631 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30631/consoleFull)
 for   PR 5574 at commit 
[`c7803cd`](https://github.com/apache/spark/commit/c7803cd8bae1fbc69fbee9bdfc2ca540a8516b9c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28742203
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/JavaUtils.java ---
@@ -137,6 +137,16 @@ private static boolean isSymlink(File file) throws 
IOException {
   .put("d", TimeUnit.DAYS)
   .build();
 
+  private static ImmutableMap byteSuffixes =
+ImmutableMap.builder()
+  .put("b", ByteUnit.BYTE)
+  .put("k", ByteUnit.KiB)
+  .put("m", ByteUnit.MiB)
+  .put("g", ByteUnit.GiB)
+  .put("t", ByteUnit.TiB)
+  .put("p", ByteUnit.PiB)
--- End diff --

I would also accept `kb, mb, gb` etc. I think these are pretty common short 
forms.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28742151
  
--- Diff: 
network/common/src/main/java/org/apache/spark/network/util/ByteUnit.java ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.util;
+
+public enum ByteUnit {
+  /** Byte (B) */
+  BYTE {
+public long toBytes(long d) { return d; }
+
+public long convert(long d, ByteUnit u) { return u.toBytes(d); }
+  },
+
+  /** Kibibyte (KiB) = 1024 Byte */
+  KiB {
+public long toBytes(long d) { return x(d, C_KiB); }
+
+public long convert(long d, ByteUnit u) { return u.toKiB(d); }
+  },
+
+  /** Mebibyte (MiB) = (1024^2) Byte */
+  MiB {
+public long toBytes(long d) { return x(d, C_MiB); }
+
+public long convert(long d, ByteUnit u) { return u.toMiB(d); }
+  },
+
+  /** Gibibyte (GiB) = (1024^3) Byte */
+  GiB {
+public long toBytes(long d) { return x(d, C_GiB);
+}
+
+public long convert(long d, ByteUnit u) { return u.toGiB(d); }
+  },
--- End diff --

style:
```
GiB {
  public long toBytes(...) { return ... };
  public long convert(...) { return ... };
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28742073
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1037,21 +1037,52 @@ private[spark] object Utils extends Logging {
   }
 
   /**
-   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of megabytes.
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to bytes for
+   * internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in bytes.
+   */
+  def byteStringAsBytes(str: String): Long = {
+JavaUtils.byteStringAsBytes(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to kibibytes 
for
+   * internal use.
--- End diff --

nit: this could totally fit on the previous line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5932][CORE] Use consistent naming for s...

2015-04-20 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5574#discussion_r28742045
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1037,21 +1037,52 @@ private[spark] object Utils extends Logging {
   }
 
   /**
-   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of megabytes.
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to bytes for
+   * internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in bytes.
+   */
+  def byteStringAsBytes(str: String): Long = {
+JavaUtils.byteStringAsBytes(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to kibibytes 
for
+   * internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
kibibytes.
+   */
+  def byteStringAsKb(str: String): Long = {
+JavaUtils.byteStringAsKb(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m) to mebibytes 
for
+   * internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
mebibytes.
+   */
+  def byteStringAsMb(str: String): Long = {
+JavaUtils.byteStringAsMb(str)
+  }
+
+  /**
+   * Convert a passed byte string (e.g. 50b, 100k, or 250m, 500g) to 
gibibytes for
+   * internal use.
+   *
+   * If no suffix is provided, the passed number is assumed to be in 
gibibytes.
+   */
+  def byteStringAsGb(str: String): Long = {
+JavaUtils.byteStringAsGb(str)
+  }
+
+  /**
+   * Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) 
to a number of mebibytes.
*/
   def memoryStringToMb(str: String): Int = {
-val lower = str.toLowerCase
-if (lower.endsWith("k")) {
-  (lower.substring(0, lower.length-1).toLong / 1024).toInt
-} else if (lower.endsWith("m")) {
-  lower.substring(0, lower.length-1).toInt
-} else if (lower.endsWith("g")) {
-  lower.substring(0, lower.length-1).toInt * 1024
-} else if (lower.endsWith("t")) {
-  lower.substring(0, lower.length-1).toInt * 1024 * 1024
-} else {// no suffix, so it's just a number in bytes
-  (lower.toLong / 1024 / 1024).toInt
-}
+// Convert to bytes, rather than directly to MB, because when no units 
are specified the unit
+// is assumed to be bytes
+(JavaUtils.byteStringAsBytes(str) / 1048576.0).toInt
--- End diff --

I would do `/ 1024 / 1024` so it's clearer


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   >