[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #79788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79788/testReport)** for PR 12646 at commit [`51ecfc8`](https://github.com/apache/spark/commit/51ecfc8e4acb0ffd6389726a8fa381dd040925a9). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r128426729

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2304,7 +2304,15 @@ object functions {
    * @group string_funcs
    * @since 1.5.0
    */
-  def ltrim(e: Column): Column = withExpr {StringTrimLeft(e.expr) }
+  def ltrim(e: Column): Column = withExpr {StringTrimLeft(e.expr)}
+
+  /**
+   * Trim the specified character string from left end for the specified string column.
+   * @group string_funcs
+   * @since 2.2.0
--- End diff --

sure
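For context on the overload under review: the PR adds trim-function variants that take an explicit set of trim characters. A minimal standalone sketch of the intended left-trim semantics, in plain Java; the class and method names are illustrative only, and Spark's actual `ltrim` is implemented as the `StringTrimLeft` expression, not like this:

```java
// Standalone sketch of trimming a set of characters from the left end of a
// string. Illustrative only; not Spark's implementation.
public class TrimSketch {

    // Drop every leading character of `s` that occurs in `trimChars`.
    public static String ltrim(String s, String trimChars) {
        int i = 0;
        while (i < s.length() && trimChars.indexOf(s.charAt(i)) >= 0) {
            i++;
        }
        return s.substring(i);
    }

    public static void main(String[] args) {
        System.out.println(ltrim("xxhello", "x"));   // hello
        System.out.println(ltrim("xyxyabc", "xy"));  // abc
    }
}
```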
[GitHub] spark issue #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as a read...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18680 @BryanCutler Thank you for reviewing! As for scope, yes, I'd like these APIs to be public. Do you have any concerns about it?
[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128425605

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala ---
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.arrow
+
+import scala.collection.JavaConverters._
+
+import org.apache.arrow.memory.RootAllocator
+import org.apache.arrow.vector.types.FloatingPointPrecision
+import org.apache.arrow.vector.types.pojo.{ArrowType, Field, FieldType, Schema}
+
+import org.apache.spark.sql.types._
+
+object ArrowUtils {
+
+  val rootAllocator = new RootAllocator(Long.MaxValue)
+
+  // todo: support more types.
+
+  def toArrowType(dt: DataType): ArrowType = dt match {
+    case BooleanType => ArrowType.Bool.INSTANCE
+    case ByteType => new ArrowType.Int(8, true)
+    case ShortType => new ArrowType.Int(8 * 2, true)
+    case IntegerType => new ArrowType.Int(8 * 4, true)
+    case LongType => new ArrowType.Int(8 * 8, true)
+    case FloatType => new ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+    case DoubleType => new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+    case StringType => ArrowType.Utf8.INSTANCE
+    case BinaryType => ArrowType.Binary.INSTANCE
+    case DecimalType.Fixed(precision, scale) => new ArrowType.Decimal(precision, scale)
+    case _ => throw new UnsupportedOperationException(s"Unsupported data type: ${dt.simpleString}")
+  }
+
+  def fromArrowType(dt: ArrowType): DataType = dt match {
+    case ArrowType.Bool.INSTANCE => BooleanType
+    case int: ArrowType.Int if int.getIsSigned && int.getBitWidth == 8 => ByteType
+    case int: ArrowType.Int if int.getIsSigned && int.getBitWidth == 8 * 2 => ShortType
+    case int: ArrowType.Int if int.getIsSigned && int.getBitWidth == 8 * 4 => IntegerType
+    case int: ArrowType.Int if int.getIsSigned && int.getBitWidth == 8 * 8 => LongType
+    case float: ArrowType.FloatingPoint
+      if float.getPrecision() == FloatingPointPrecision.SINGLE => FloatType
+    case float: ArrowType.FloatingPoint
+      if float.getPrecision() == FloatingPointPrecision.DOUBLE => DoubleType
+    case ArrowType.Utf8.INSTANCE => StringType
+    case ArrowType.Binary.INSTANCE => BinaryType
+    case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
+    case _ => throw new UnsupportedOperationException(s"Unsupported data type: $dt")
+  }
+
+  def toArrowField(name: String, dt: DataType, nullable: Boolean): Field = {
--- End diff --

No, this is used to create an Arrow schema from `StructType` in `ArrowUtils.toArrowSchema()`, too.
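The two conversion functions above form an inverse pair keyed on Arrow's signed-integer bit widths (8 * 1, 8 * 2, 8 * 4, 8 * 8 for Byte/Short/Int/Long). A standalone sketch of that round-trip mapping, with Spark type names as plain strings since neither the Spark nor the Arrow classes are assumed here (the class name is illustrative):

```java
// Standalone sketch of the toArrowType/fromArrowType inverse pair for the
// signed integral types, keyed on Arrow's bit width.
public class TypeMappingSketch {

    // Mirrors toArrowType: ByteType -> Int(8), ShortType -> Int(8 * 2), etc.
    public static int bitWidth(String sparkType) {
        switch (sparkType) {
            case "ByteType":    return 8;
            case "ShortType":   return 16;
            case "IntegerType": return 32;
            case "LongType":    return 64;
            default:
                throw new UnsupportedOperationException("Unsupported data type: " + sparkType);
        }
    }

    // Mirrors fromArrowType's guards on getBitWidth() for signed ints.
    public static String sparkType(int bitWidth) {
        switch (bitWidth) {
            case 8:  return "ByteType";
            case 16: return "ShortType";
            case 32: return "IntegerType";
            case 64: return "LongType";
            default:
                throw new UnsupportedOperationException("Unsupported bit width: " + bitWidth);
        }
    }

    public static void main(String[] args) {
        // Round trip: every supported type maps back to itself.
        for (String t : new String[]{"ByteType", "ShortType", "IntegerType", "LongType"}) {
            System.out.println(t + " <-> " + bitWidth(t) + " bits");
        }
    }
}
```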
[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128425637

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ReadOnlyColumnVector.java ---
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.vectorized;
+
+import org.apache.spark.memory.MemoryMode;
+import org.apache.spark.sql.types.*;
+
+/**
+ * An abstract class for read-only column vector.
+ */
+public abstract class ReadOnlyColumnVector extends ColumnVector {
--- End diff --

I agree that it'd be better to refactor `ColumnVector`, but `ColumnVector` is tied to `ColumnarBatch` and other classes, so we should do that refactoring, and refactor `ColumnarBatch` at the same time, in future PRs.
[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128425617

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java ---
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.vectorized;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.complex.*;
+import org.apache.arrow.vector.holders.NullableVarCharHolder;
+
+import org.apache.spark.memory.MemoryMode;
+import org.apache.spark.sql.execution.arrow.ArrowUtils;
+import org.apache.spark.sql.types.*;
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * A column backed by Apache Arrow.
+ */
+public final class ArrowColumnVector extends ReadOnlyColumnVector {
+
+  private final ArrowVectorAccessor accessor;
+
+  @Override
+  public long nullsNativeAddress() {
+    throw new RuntimeException("Cannot get native address for arrow column");
+  }
+
+  @Override
+  public long valuesNativeAddress() {
+    throw new RuntimeException("Cannot get native address for arrow column");
+  }
+
+  @Override
+  public void close() {
+    if (childColumns != null) {
+      for (int i = 0; i < childColumns.length; i++) {
+        childColumns[i].close();
+      }
+    }
+    accessor.close();
+  }
+
+  //
+  // APIs dealing with nulls
+  //
+
+  @Override
+  public boolean isNullAt(int rowId) {
+    return accessor.isNullAt(rowId);
+  }
+
+  //
+  // APIs dealing with Booleans
+  //
+
+  @Override
+  public boolean getBoolean(int rowId) {
+    return accessor.getBoolean(rowId);
+  }
+
+  @Override
+  public boolean[] getBooleans(int rowId, int count) {
+    boolean[] array = new boolean[count];
+    for (int i = 0; i < count; ++i) {
+      array[i] = accessor.getBoolean(rowId + i);
+    }
+    return array;
+  }
+
+  //
+  // APIs dealing with Bytes
+  //
+
+  @Override
+  public byte getByte(int rowId) {
+    return accessor.getByte(rowId);
+  }
+
+  @Override
+  public byte[] getBytes(int rowId, int count) {
+    byte[] array = new byte[count];
+    for (int i = 0; i < count; ++i) {
+      array[i] = accessor.getByte(rowId + i);
+    }
+    return array;
+  }
+
+  //
+  // APIs dealing with Shorts
+  //
+
+  @Override
+  public short getShort(int rowId) {
+    return accessor.getShort(rowId);
+  }
+
+  @Override
+  public short[] getShorts(int rowId, int count) {
+    short[] array = new short[count];
+    for (int i = 0; i < count; ++i) {
+      array[i] = accessor.getShort(rowId + i);
+    }
+    return array;
+  }
+
+  //
+  // APIs dealing with Ints
+  //
+
+  @Override
+  public int getInt(int rowId) {
+    return accessor.getInt(rowId);
+  }
+
+  @Override
+  public int[] getInts(int rowId, int count) {
+    int[] array = new int[count];
+    for (int i = 0; i < count; ++i) {
+      array[i] = accessor.getInt(rowId + i);
+    }
+    return array;
+  }
+
+  @Override
+  public int getDictId(int rowId) {
+    throw new UnsupportedOperationException();
+  }
+
+  //
+  // APIs dealing with Longs
+  //
+
+  @Override
+  public long getLong(int rowId) {
+    return accessor.getLong(rowId);
+  }
+
+  @Override
+  public long[] getLongs(int rowId, int count) {
+    long[] array = new long[count];
+    for (int i = 0; i < count; ++i) {
+      array[i] = accessor.getLong(rowId + i);
+    }
+    return array;
+  }
+
+  //
+  // APIs dealing with floats
+  //
+
+  @Override
+  public float getFloat(int rowId) {
+    return accessor.getFloat(rowId);
+  }
+
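All the batch getters quoted above follow one pattern: allocate an array of `count` elements and fill it from the per-row accessor starting at `rowId`. A standalone sketch of that pattern, with a plain `IntUnaryOperator` standing in for the Arrow accessor (the class name is hypothetical):

```java
import java.util.function.IntUnaryOperator;

// Standalone sketch of the batch-getter pattern used throughout
// ArrowColumnVector: allocate `count` slots and fill them from the per-row
// accessor starting at `rowId`.
public class BatchGetterSketch {

    public static int[] getInts(IntUnaryOperator accessor, int rowId, int count) {
        int[] array = new int[count];
        for (int i = 0; i < count; ++i) {
            array[i] = accessor.applyAsInt(rowId + i);
        }
        return array;
    }

    public static void main(String[] args) {
        // Rows 3..5 of a column whose value at row r is r * 2.
        System.out.println(java.util.Arrays.toString(getInts(r -> r * 2, 3, 3))); // [6, 8, 10]
    }
}
```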
[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18468 cc: @cloud-fan and @ueshin
[GitHub] spark issue #18652: [WIP] Pull non-deterministic joining keys from Join oper...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18652

I just checked Hive's behavior (2.1.0). I tried a query like:

select * from l left outer join r on rand(l.a) > 0.1 and rand(cast(l.b as int)) > 0.2 and rand(r.c) > 0.2 and rand(cast(r.d as int)) > 0.5;

The conditions `rand(r.c) > 0.2 and rand(cast(r.d as int)) > 0.5` are pushed down to a Filter operator:

    TableScan
      alias: r
      Statistics: Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
      Select Operator
        expressions: c (type: int), d (type: double)
        outputColumnNames: _col0, _col1
        Statistics: Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
        Filter Operator
          predicate: ((rand(UDFToInteger(_col1)) > 0.5) and (rand(_col0) > 0.2)) (type: boolean)
          Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
          HashTable Sink Operator
            filter predicates:
              0 {(rand(_col0) > 0.1)} {(rand(UDFToInteger(_col1)) > 0.2)}
              1
            keys:
              0

The other conditions `rand(l.a) > 0.1 and rand(cast(l.b as int)) > 0.2` remain as filter predicates on the Join operator:

    Map Join Operator
      condition map:
        Left Outer Join0 to 1
      filter predicates:
        0 {(rand(_col0) > 0.1)} {(rand(UDFToInteger(_col1)) > 0.2)}
        1
      keys:
        0
        1

For a query with non-deterministic joining keys, `select * from l left outer join r on rand(l.a) = rand(r.c);`, there is no push down; Hive simply evaluates the joining keys:

    Map Join Operator
      condition map:
        Left Outer Join0 to 1
      keys:
        0 rand(_col0) (type: double)
        1 rand(_col0) (type: double)
      outputColumnNames: _col0, _col1, _col2, _col3
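The behavior described above can be summarized as: Hive pushes a join conjunct below the join only when every column it references comes from a single side. A hypothetical standalone sketch of that classification, modeling each predicate as the set of columns it references (not Hive's or Spark's code; all names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Standalone sketch of the pushdown rule the Hive plans above illustrate:
// a conjunct is pushable to one side of the join only if all the columns it
// references belong to that side.
public class PushdownSketch {

    public static List<Set<String>> pushableTo(List<Set<String>> predicates,
                                               Set<String> sideColumns) {
        List<Set<String>> pushable = new ArrayList<>();
        for (Set<String> refs : predicates) {
            if (sideColumns.containsAll(refs)) {
                pushable.add(refs);
            }
        }
        return pushable;
    }

    public static void main(String[] args) {
        // rand(r.c) > 0.2 references {c}: pushable to the r side {c, d}.
        // rand(l.a) > 0.1 references {a}: not pushable to the r side.
        List<Set<String>> preds = List.of(Set.of("c"), Set.of("a"));
        System.out.println(pushableTo(preds, Set.of("c", "d"))); // [[c]]
    }
}
```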
[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18185 Will review it tonight. Thanks!
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18388 Merged build finished. Test PASSed.
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18388 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79785/ Test PASSed.
[GitHub] spark issue #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as a read...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18680 **[Test build #79787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79787/testReport)** for PR 18680 at commit [`91b94ef`](https://github.com/apache/spark/commit/91b94ef6d08771fe8e5eb5d41f43153af9a75f06).
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18388 **[Test build #79785 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79785/testReport)** for PR 18388 at commit [`7dd2cec`](https://github.com/apache/spark/commit/7dd2cec311189feb555f3cfdbb27b29676efc18b).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128422124

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ReadOnlyColumnVector.java ---
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.vectorized;
+
+import org.apache.spark.memory.MemoryMode;
+import org.apache.spark.sql.types.*;
+
+/**
+ * An abstract class for read-only column vector.
+ */
+public abstract class ReadOnlyColumnVector extends ColumnVector {
+
+  protected ReadOnlyColumnVector(int capacity, MemoryMode memMode) {
--- End diff --

I see, I'll modify it to accept `dataType`, but I guess we shouldn't pass it to `ColumnVector`, to avoid illegally allocating child columns.
[GitHub] spark issue #18550: [Minor][SS][DOCS] Minor doc change for kafka integration
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18550 ping @tdas Please take a look at this simple doc change. Thanks.
[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18444
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Thanks! Merging to master.
[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18655 **[Test build #79786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79786/testReport)** for PR 18655 at commit [`7084b38`](https://github.com/apache/spark/commit/7084b388d87c8347b79898827658d7827bf5649d).
[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12646 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79782/ Test PASSed.
[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12646 Merged build finished. Test PASSed.
[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #79782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79782/testReport)** for PR 12646 at commit [`9bb80ea`](https://github.com/apache/spark/commit/9bb80eaf8e0b4339850d8c48e221c8ad1e477552).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18468 Merged build finished. Test PASSed.
[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18468 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79783/ Test PASSed.
[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18468 **[Test build #79783 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79783/testReport)** for PR 18468 at commit [`4b4e281`](https://github.com/apache/spark/commit/4b4e2812d250d3d46fdbcd29c3e66964ea6dd345).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18607: [SPARK-21362][SQL][Adding Apache Drill JDBC Diale...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18607#discussion_r128414516

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/ApacheDrillDialect.scala ---
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.jdbc
+
+import java.sql.Types
+
+import org.apache.spark.sql.types.{BooleanType, DataType, LongType, MetadataBuilder}
--- End diff --

ditto.
[GitHub] spark pull request #18607: [SPARK-21362][SQL][Adding Apache Drill JDBC Diale...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18607#discussion_r128414496

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/ApacheDrillDialect.scala ---
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.jdbc
+
+import java.sql.Types
--- End diff --

Do we use this import?
[GitHub] spark issue #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark History servi...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16924 In `FsHistoryProvider`, since there is a check for file size, I think it is designed to find updated logs from running applications? https://github.com/apache/spark/blob/e26dac5feb02033f980b1e69c9b0ff50869b6f9e/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L331 Btw, in `EventLoggingListener`, if it writes to local files, it seems to me the file length will be updated and so `FsHistoryProvider` can find the updated logs. This seems reasonable for making the non-local fs functionally consistent with the local fs.
[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18576 ping
[GitHub] spark issue #18185: [SPARK-20962][SQL] Support subquery column aliases in FR...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18185 ping
[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128410279 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -443,12 +459,57 @@ final class ShuffleBlockFetcherIterator( } private def fetchUpToMaxBytes(): Unit = { -// Send fetch requests up to maxBytesInFlight -while (fetchRequests.nonEmpty && - (bytesInFlight == 0 || -(reqsInFlight + 1 <= maxReqsInFlight && - bytesInFlight + fetchRequests.front.size <= maxBytesInFlight))) { - sendRequest(fetchRequests.dequeue()) +// Send fetch requests up to maxBytesInFlight. If you cannot fetch from a remote host +// immediately, defer the request until the next time it can be processed. + +// Process any outstanding deferred fetch requests if possible. +if (deferredFetchRequests.nonEmpty) { + for ((remoteAddress, defReqQueue) <- deferredFetchRequests) { +while (isRemoteBlockFetchable(defReqQueue) && +!isRemoteAddressMaxedOut(remoteAddress, defReqQueue.front)) { + val request = defReqQueue.dequeue() + logDebug(s"Processing deferred fetch request for $remoteAddress with " ++ s"${request.blocks.length} blocks") + send(remoteAddress, request) + if (defReqQueue.isEmpty) { +deferredFetchRequests -= remoteAddress + } +} + } +} + +// Process any regular fetch requests if possible. +while (isRemoteBlockFetchable(fetchRequests)) { + val request = fetchRequests.dequeue() + val remoteAddress = request.address + if (isRemoteAddressMaxedOut(remoteAddress, request)) { +logDebug(s"Deferring fetch request for $remoteAddress with ${request.blocks.size} blocks") +val defReqQueue = deferredFetchRequests.getOrElse(remoteAddress, new Queue[FetchRequest]()) +defReqQueue.enqueue(request) +deferredFetchRequests(remoteAddress) = defReqQueue --- End diff -- the `defReqQueue` is mutable, so we don't need to do this.
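The per-address deferral scheme being reviewed in the diff above can be modeled in a few lines. This is a hypothetical Python sketch, not Spark's Scala implementation: requests to an address that already has too many blocks in flight are parked in a per-address queue, and deferred queues are drained first on the next pass. As noted in the review, empty deferred queues are deliberately kept around, since later requests for the same address may still be deferred.

```python
from collections import deque

class FetchScheduler:
    """Toy model of per-address block-fetch deferral (hypothetical class,
    loosely following the logic quoted in the diff above)."""

    def __init__(self, max_blocks_per_address):
        self.max_blocks_per_address = max_blocks_per_address
        self.blocks_in_flight = {}   # address -> blocks currently in flight
        self.deferred = {}           # address -> queue of deferred block counts
        self.sent = []               # (address, num_blocks) actually dispatched

    def _maxed_out(self, address, num_blocks):
        # Would dispatching this request push the address over its cap?
        return (self.blocks_in_flight.get(address, 0) + num_blocks
                > self.max_blocks_per_address)

    def _send(self, address, num_blocks):
        self.blocks_in_flight[address] = (
            self.blocks_in_flight.get(address, 0) + num_blocks)
        self.sent.append((address, num_blocks))

    def fetch_up_to_max(self, requests):
        # Drain deferred queues first; empty queues stay in the map because
        # requests processed below may still be deferred onto them.
        for address, queue in self.deferred.items():
            while queue and not self._maxed_out(address, queue[0]):
                self._send(address, queue.popleft())
        # Then process regular requests, deferring any that would exceed
        # the per-address cap.
        for address, num_blocks in requests:
            if self._maxed_out(address, num_blocks):
                self.deferred.setdefault(address, deque()).append(num_blocks)
            else:
                self._send(address, num_blocks)
```

With a cap of 2 blocks per address, a second request for a saturated host is deferred while requests for other hosts proceed.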
[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128410233 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -443,12 +459,57 @@ final class ShuffleBlockFetcherIterator( } private def fetchUpToMaxBytes(): Unit = { -// Send fetch requests up to maxBytesInFlight -while (fetchRequests.nonEmpty && - (bytesInFlight == 0 || -(reqsInFlight + 1 <= maxReqsInFlight && - bytesInFlight + fetchRequests.front.size <= maxBytesInFlight))) { - sendRequest(fetchRequests.dequeue()) +// Send fetch requests up to maxBytesInFlight. If you cannot fetch from a remote host +// immediately, defer the request until the next time it can be processed. + +// Process any outstanding deferred fetch requests if possible. +if (deferredFetchRequests.nonEmpty) { + for ((remoteAddress, defReqQueue) <- deferredFetchRequests) { +while (isRemoteBlockFetchable(defReqQueue) && +!isRemoteAddressMaxedOut(remoteAddress, defReqQueue.front)) { + val request = defReqQueue.dequeue() + logDebug(s"Processing deferred fetch request for $remoteAddress with " ++ s"${request.blocks.length} blocks") + send(remoteAddress, request) + if (defReqQueue.isEmpty) { +deferredFetchRequests -= remoteAddress --- End diff -- we can leave the empty queue here, as we may still have fetch requests to put in this queue.
[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128409414 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -375,6 +390,7 @@ final class ShuffleBlockFetcherIterator( result match { case r @ SuccessFetchResult(blockId, address, size, buf, isNetworkReqDone) => if (address != blockManager.blockManagerId) { +numBlocksInFlightPerAddress(address) = numBlocksInFlightPerAddress(address) - 1 --- End diff -- can we do this earlier? e.g. right after the fetch result is enqueued to `results`.
[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128408952 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -321,6 +321,17 @@ package object config { .intConf .createWithDefault(3) + private[spark] val REDUCER_MAX_BLOCKS_IN_FLIGHT_PER_ADDRESS = +ConfigBuilder("spark.reducer.maxBlocksInFlightPerAddress") + .doc("This configuration limits the number of remote blocks being fetched per reduce task" + +" from a given host port. When a large number of blocks are being requested from a given" + +" address in a single fetch or simultaneously, this could crash the serving executor or" + +" Node Manager. This is especially useful to reduce the load on the Node Manager when" + --- End diff -- shall we say `shuffle service` instead of `Node Manager`?
[GitHub] spark pull request #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18503#discussion_r128408940 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java --- @@ -167,6 +167,7 @@ public UnsafeRow() {} */ public void pointTo(Object baseObject, long baseOffset, int sizeInBytes) { assert numFields >= 0 : "numFields (" + numFields + ") should >= 0"; +assert sizeInBytes % 8 == 0 : "sizeInBytes (" + sizeInBytes + ") should be a multiple of 8"; --- End diff -- Yes, done.
[GitHub] spark pull request #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18503#discussion_r128408918 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -479,6 +479,61 @@ class StreamSuite extends StreamTest { CheckAnswer((1, 2), (2, 2), (3, 2))) } + testQuietly("store to and recover from a checkpoint") { --- End diff -- Ah, you are right. This test currently relies on the internal assert in `Unsafe.pointTo` to check for a multiple of 8.
[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18487 @rxin it's kind of a stability fix (it makes the shuffle service more stable), so I'm ok to backport if the conflict is small.
[GitHub] spark pull request #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/18503#discussion_r128408410 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -363,7 +363,8 @@ private[state] class HDFSBackedStateStoreProvider extends StateStoreProvider wit val valueRowBuffer = new Array[Byte](valueSize) ByteStreams.readFully(input, valueRowBuffer, 0, valueSize) val valueRow = new UnsafeRow(valueSchema.fields.length) -valueRow.pointTo(valueRowBuffer, valueSize) +// If valueSize in existing file is not multiple of 8, round it down to multiple of 8 +valueRow.pointTo(valueRowBuffer, (valueSize / 8) * 8) --- End diff -- This isn't rounding; it essentially floors to a multiple of 8. @cloud-fan is this safe to do with ANY row generated in earlier Spark 2.0 - 2.2? I want to be 100% sure.
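The distinction the reviewer draws (flooring versus rounding) comes down to integer division discarding the remainder. A quick illustration of the `(valueSize / 8) * 8` arithmetic, written here in Python for brevity:

```python
def floor_to_multiple_of_8(size_in_bytes):
    # Integer division truncates toward zero for non-negative sizes, so
    # this lands on the nearest multiple of 8 that is <= size_in_bytes:
    # a floor, not a round-to-nearest.
    return (size_in_bytes // 8) * 8

assert floor_to_multiple_of_8(64) == 64   # already aligned: unchanged
assert floor_to_multiple_of_8(67) == 64   # floored down, not rounded up to 72
assert floor_to_multiple_of_8(71) == 64   # even 71 floors to 64, never 72
```

Rounding to the nearest multiple would map 67 to 64 but 71 to 72; flooring maps both to 64, which is why the review insists on precise wording in the comment.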
[GitHub] spark pull request #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/18503#discussion_r128408166 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -479,6 +479,61 @@ class StreamSuite extends StreamTest { CheckAnswer((1, 2), (2, 2), (3, 2))) } + testQuietly("store to and recover from a checkpoint") { --- End diff -- It does not really check it explicitly... does it? It tests it implicitly by creating checkpoints and then restarting. There are other tests that already do the same thing. E.g. this test is effectively the same as https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala#L88
[GitHub] spark issue #18281: [SPARK-21027][ML][PYTHON] Added tunable parallelism to o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18281 Merged build finished. Test PASSed.
[GitHub] spark issue #18281: [SPARK-21027][ML][PYTHON] Added tunable parallelism to o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18281 **[Test build #79784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79784/testReport)** for PR 18281 at commit [`ce14172`](https://github.com/apache/spark/commit/ce14172711b51a4321ed02a3cf8450a54374d4f5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18281: [SPARK-21027][ML][PYTHON] Added tunable parallelism to o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18281 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79784/ Test PASSed.
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18388 **[Test build #79785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79785/testReport)** for PR 18388 at commit [`7dd2cec`](https://github.com/apache/spark/commit/7dd2cec311189feb555f3cfdbb27b29676efc18b).
[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79781/ Test FAILed.
[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.
[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #79781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79781/testReport)** for PR 18664 at commit [`b709d78`](https://github.com/apache/spark/commit/b709d78c03701f92f617651879ee33dada0c4da1). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17848 Merged build finished. Test PASSed.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17848 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79780/ Test PASSed.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848 **[Test build #79780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79780/testReport)** for PR 17848 at commit [`0ea4691`](https://github.com/apache/spark/commit/0ea4691d3ea979b86cb7c44f8290ff7dc805a8a7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18029: [SPARK-20168][DStream] Add changes to use kinesis fetche...
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/18029 @budde @brkyvz - could I get some love here please.
[GitHub] spark issue #18281: [SPARK-21027][ML][PYTHON] Added tunable parallelism to o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18281 **[Test build #79784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79784/testReport)** for PR 18281 at commit [`ce14172`](https://github.com/apache/spark/commit/ce14172711b51a4321ed02a3cf8450a54374d4f5).
[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18468 **[Test build #79783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79783/testReport)** for PR 18468 at commit [`4b4e281`](https://github.com/apache/spark/commit/4b4e2812d250d3d46fdbcd29c3e66964ea6dd345).
[GitHub] spark issue #18607: [SPARK-21362][SQL][Adding Apache Drill JDBC Dialect]
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18607 Could you also add the docker-based test suite, like what we did in https://github.com/apache/spark/pull/9893/files?
[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #79782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79782/testReport)** for PR 12646 at commit [`9bb80ea`](https://github.com/apache/spark/commit/9bb80eaf8e0b4339850d8c48e221c8ad1e477552).
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r128398109 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2304,7 +2304,15 @@ object functions { * @group string_funcs * @since 1.5.0 */ - def ltrim(e: Column): Column = withExpr {StringTrimLeft(e.expr) } + def ltrim(e: Column): Column = withExpr {StringTrimLeft(e.expr)} + + /** + * Trim the specified character string from left end for the specified string column. + * @group string_funcs + * @since 2.2.0 --- End diff -- Update the versions to 2.3.0
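For readers following the review: unlike the plain `ltrim`, which strips whitespace, the variant under discussion removes any leading characters belonging to a given set. A rough behavioral model in Python (a hypothetical helper for illustration, not the Spark API itself):

```python
def ltrim_chars(s, trim_chars):
    # Advance past characters at the left end while they belong to the
    # trim set, mirroring SQL's TRIM(LEADING trimStr FROM str) semantics.
    i = 0
    while i < len(s) and s[i] in trim_chars:
        i += 1
    return s[i:]

assert ltrim_chars("xxSpark", "x") == "Spark"
assert ltrim_chars("xySparkx", "xy") == "Sparkx"  # only the left end is trimmed
assert ltrim_chars("Spark", "x") == "Spark"       # nothing to trim: unchanged
```

Note that the trim argument is treated as a set of characters, not as a prefix string to match whole.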
[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/12646 retest this please
[GitHub] spark issue #18686: [SPARK-21477] [SQL] [MINOR] Mark LocalTableScanExec's in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18686 Merged build finished. Test PASSed.
[GitHub] spark issue #18686: [SPARK-21477] [SQL] [MINOR] Mark LocalTableScanExec's in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18686 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79777/ Test PASSed.
[GitHub] spark issue #18686: [SPARK-21477] [SQL] [MINOR] Mark LocalTableScanExec's in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18686 **[Test build #79777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79777/testReport)** for PR 18686 at commit [`662d377`](https://github.com/apache/spark/commit/662d377ebcf8c62afa87cabaa6bfd4cd77fb9630). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @ueshin @holdenk I think I'm seeing an issue with transferring timestamp data to Pandas with Arrow, so I'll try to explain. Spark will assume the timestamp is in local time, so when converting to internal data, it will adjust with an offset to UTC from the local timezone. Currently, the internal data is converted to Arrow without a timezone, which Arrow takes as timezone-unaware. When a Pandas DataFrame is created from that data, it does not adjust to local time, so a different timestamp is shown. For my case below, using PST as local time, it will add 8 hours.

```
In [2]: dt = datetime.datetime(1970, 1, 1, 0, 0, 1)

In [5]: TimestampType().toInternal(dt)
Out[5]: 2880100

In [8]: df = spark.createDataFrame([(dt,)], schema=StructType([StructField("ts", TimestampType(), True)]))

In [7]: df.show()
+-------------------+
|                 ts|
+-------------------+
|1970-01-01 00:00:01|
+-------------------+

In [9]: spark.conf.set("spark.sql.execution.arrow.enable", "true")

In [10]: df.toPandas()
Out[10]:
                   ts
0 1970-01-01 08:00:01

In [11]: spark.conf.set("spark.sql.execution.arrow.enable", "false")

In [12]: df.toPandas()
Out[12]:
                   ts
0 1970-01-01 00:00:01
```

It wasn't a problem before Arrow because the data gets converted before going into Pandas. I believe there are a few different ways to handle this:

1) Adjust the Spark internal data to represent local time, not UTC time, and create an Arrow field without specifying the timezone.
2) Give the Arrow field the timezone from `DateTimeUtils.defaultTimeZone()` and adjust the internal data to represent local time, not UTC time.
3) Give the Arrow field a "UTC" timezone; then no adjustments need to be done to the internal data, but I think Pandas will still display as UTC and it would be up to the user to change the timezone.

I'm not sure what the best solution is because there could be issues with them all, any thoughts?
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
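To make the UTC-tagging idea in option 3 above concrete, here is a small pandas-only sketch (no Spark involved; the epoch-microsecond arithmetic and the `US/Pacific` zone name are illustrative assumptions, not Spark's actual code path):

```python
import datetime
import pandas as pd

# A naive timestamp, as a user would create it.
dt = datetime.datetime(1970, 1, 1, 0, 0, 1)

# Microseconds since the epoch, interpreting the wall clock as UTC
# (the general shape of Spark's internal representation).
micros = int(dt.replace(tzinfo=datetime.timezone.utc).timestamp() * 1_000_000)

# Tag the column as UTC up front, so nothing shifts silently when the
# data crosses into pandas...
s = pd.to_datetime(pd.Series([micros]), unit="us", utc=True)

# ...and leave local-time rendering as an explicit, user-driven step.
local = s.dt.tz_convert("US/Pacific")
```

Under this scheme the stored value stays `1970-01-01 00:00:01+00:00`, and the 8-hour shift only appears when the user asks for it via `tz_convert`.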
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18444 Merged build finished. Test PASSed.
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18444 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79774/ Test PASSed.
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18444

**[Test build #79774 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79774/testReport)** for PR 18444 at commit [`a340745`](https://github.com/apache/spark/commit/a3407459405c2a5b3c7539d5075853e65c80f9cd).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #79781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79781/testReport)** for PR 18664 at commit [`b709d78`](https://github.com/apache/spark/commit/b709d78c03701f92f617651879ee33dada0c4da1).
[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18684 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79771/ Test PASSed.
[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18684 Merged build finished. Test PASSed.
[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18684

**[Test build #79771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79771/testReport)** for PR 18684 at commit [`f2d534a`](https://github.com/apache/spark/commit/f2d534a1693c31138b464ed1094dc05888cdc3d0).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18676: [SPARK-21463] Allow userSpecifiedSchema to overri...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18676
[GitHub] spark issue #18676: [SPARK-21463] Allow userSpecifiedSchema to override part...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/18676 Thanks! Merging to master
[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18487 hm is this a bug fix? if not we shouldn't cherry pick it.
[GitHub] spark pull request #18462: [SPARK-21333][Docs] Removed invalid joinTypes fro...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18462
[GitHub] spark issue #18462: [SPARK-21333][Docs] Removed invalid joinTypes from javad...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18462 Thanks! Merging to master/2.2
[GitHub] spark pull request #18674: [SPARK-21456][MESOS] Make the driver failover_tim...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18674
[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18674 Merging to master.
[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...
Github user susanxhuynh commented on the issue: https://github.com/apache/spark/pull/18674 @vanzin Thanks for the review. I have made the changes you recommended (documenting the zero default value and using the config key).
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17848 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79779/ Test FAILed.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17848 Merged build finished. Test FAILed.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848

**[Test build #79779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79779/testReport)** for PR 17848 at commit [`43bb9a9`](https://github.com/apache/spark/commit/43bb9a9254d0d694b2be57ec6a3574d53e9c3141).

* This patch **fails to generate documentation**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18674 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79778/ Test PASSed.
[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18674 Merged build finished. Test PASSed.
[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18674

**[Test build #79778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79778/testReport)** for PR 18674 at commit [`f4a001f`](https://github.com/apache/spark/commit/f4a001faa612655c6c2aa7a7da85248be862241a).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848 **[Test build #79780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79780/testReport)** for PR 17848 at commit [`0ea4691`](https://github.com/apache/spark/commit/0ea4691d3ea979b86cb7c44f8290ff7dc805a8a7).
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848 **[Test build #79779 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79779/testReport)** for PR 17848 at commit [`43bb9a9`](https://github.com/apache/spark/commit/43bb9a9254d0d694b2be57ec6a3574d53e9c3141).
[GitHub] spark pull request #18665: [SPARK-21446] [SQL] Fix setAutoCommit never execu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18665
[GitHub] spark issue #18674: [SPARK-21456][MESOS] Make the driver failover_timeout co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18674 **[Test build #79778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79778/testReport)** for PR 18674 at commit [`f4a001f`](https://github.com/apache/spark/commit/f4a001faa612655c6c2aa7a7da85248be862241a).
[GitHub] spark issue #18665: [SPARK-21446] [SQL] Fix setAutoCommit never executed
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18665 Thanks! Merging to master/2.2/2.1
[GitHub] spark issue #18676: [SPARK-21463] Allow userSpecifiedSchema to override part...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/18676 LGTM
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17848 Merged build finished. Test FAILed.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17848 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79776/ Test FAILed.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848

**[Test build #79776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79776/testReport)** for PR 17848 at commit [`d0a9086`](https://github.com/apache/spark/commit/d0a90865ca7c6a9afd6fbb28b3e8d1c9c602013c).

* This patch **fails to generate documentation**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18684 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79769/ Test PASSed.
[GitHub] spark issue #18686: [SQL] [MINOR] Mark LocalTableScanExec's input data trans...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18686 **[Test build #79777 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79777/testReport)** for PR 18686 at commit [`662d377`](https://github.com/apache/spark/commit/662d377ebcf8c62afa87cabaa6bfd4cd77fb9630).
[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18684 Merged build finished. Test PASSed.
[GitHub] spark issue #18686: [SQL] [MINOR] Mark LocalTableScanExec's input data trans...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18686 cc @cloud-fan
[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18684

**[Test build #79769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79769/testReport)** for PR 18684 at commit [`b9dad5a`](https://github.com/apache/spark/commit/b9dad5ac976261359623fafbbfa9389310272238).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18686: [SQL] [MINOR] Mark LocalTableScanExec's input dat...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/18686

[SQL] [MINOR] Mark LocalTableScanExec's input data transient

## What changes were proposed in this pull request?

This PR is to mark the parameters `rows` and `unsafeRow` transient. This avoids serializing the unneeded objects.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark LocalTableScanExec

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18686.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #18686
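The idea behind the patch, excluding bulky derived input from serialization while keeping the small metadata, can be sketched with a plain-Python analogy (this is not Spark's actual Scala code, where the `@transient` annotation plays this role; the class and field names below are hypothetical):

```python
import pickle

class LocalScan:
    """Toy stand-in for an operator that caches bulky input rows."""
    def __init__(self, rows):
        self.rows = rows             # bulky data we don't want shipped
        self.schema = ["id", "val"]  # small metadata that must survive

    def __getstate__(self):
        # Analogous to marking a field transient: drop `rows` when serializing.
        state = self.__dict__.copy()
        del state["rows"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.rows = None  # not needed on the deserializing side

scan = LocalScan(rows=list(range(1_000_000)))
blob = pickle.dumps(scan)      # stays tiny: `rows` was excluded
restored = pickle.loads(blob)  # metadata intact, rows absent
```

The serialized blob carries only the schema, so shipping the operator does not drag a million rows along with it.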
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Merged build finished. Test PASSed.
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79770/ Test PASSed.
[GitHub] spark issue #18503: [SPARK-21271][SQL] Ensure Unsafe.sizeInBytes is a multip...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18503

**[Test build #79770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79770/testReport)** for PR 18503 at commit [`762f02a`](https://github.com/apache/spark/commit/762f02a2c9211ab953a2dc4b2d9938911f2e883d).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17848: [SPARK-20586] [SQL] Add deterministic to ScalaUDF and Ja...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17848 **[Test build #79776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79776/testReport)** for PR 17848 at commit [`d0a9086`](https://github.com/apache/spark/commit/d0a90865ca7c6a9afd6fbb28b3e8d1c9c602013c).
[GitHub] spark issue #18281: [SPARK-21027][ML][PYTHON] Added tunable parallelism to o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18281 Merged build finished. Test FAILed.