[GitHub] spark issue #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() s...

2016-08-10 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/14579
  
Right, I wouldn't expect it to error with subclassing - just not pipeline 
successfully - but only in a very long-shot corner case.

I think try/finally with persistence is not an uncommon pattern (we 
have something similar happen frequently inside Spark ML/MLlib, but it's in 
Scala code).
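
For readers unfamiliar with the pattern, here is a minimal sketch of try/finally around persistence. `FakeDataset` is a hypothetical stand-in that only tracks a cached flag; it is not the PySpark API, although its `persist()`/`unpersist()` names mirror the RDD/DataFrame methods. `total_with_cache` is likewise an illustrative name.

```python
class FakeDataset:
    """Hypothetical stand-in for an RDD/DataFrame; it only tracks cache state."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.is_cached = False

    def persist(self):
        self.is_cached = True
        return self

    def unpersist(self):
        self.is_cached = False
        return self


def total_with_cache(dataset):
    """Cache the dataset for the duration of the work, then always release it."""
    dataset.persist()
    try:
        # Any work that benefits from the cached data goes here.
        return sum(dataset.rows)
    finally:
        # Runs on success and on failure, so the cache never leaks.
        dataset.unpersist()
```

The point of the `finally` clause is that the data is unpersisted even when the body raises, which is why the pattern shows up around iterative code that caches intermediate results.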


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14580
  
Oracle supports it... 
http://docs.oracle.com/javadb/10.10.1.2/ref/rrefsqljusing.html 







[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12004
  
**[Test build #63588 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63588/consoleFull)**
 for PR 12004 at commit 
[`cb07c1d`](https://github.com/apache/spark/commit/cb07c1d7b79944059e477b0b615ce061b08cef00).





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14590
  
**[Test build #3216 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3216/consoleFull)**
 for PR 14590 at commit 
[`e061820`](https://github.com/apache/spark/commit/e0618203c317f8b8211c0e983403834f8e39a950).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14454: [Minor] [ML] Rename TreeEnsembleModels to TreeEnsembleMo...

2016-08-10 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/14454
  
ping @srowen 





[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14559
  
**[Test build #63587 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63587/consoleFull)**
 for PR 14559 at commit 
[`57be055`](https://github.com/apache/spark/commit/57be055c542d1720bb9fd57810d4c2593444).





[GitHub] spark pull request #14591: [SPARK-17010][MINOR][DOC]Wrong description in mem...

2016-08-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14591





[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-10 Thread GraceH
Github user GraceH commented on the issue:

https://github.com/apache/spark/pull/14559
  
@HyukjinKwon and @srowen, here is the initial proposal. Please let me know 
your comments. I will refine it with unit tests later.

BTW, readwriter.py calls the high-level API jdbc(url, table, 
connectionProperties). If we don't change that API as the reader API does, we may 
not need to expose JDBCOptions in that file. What do you think? 





[GitHub] spark issue #14592: [SPARK-17011][SQL] Support testing exceptions in SQLQuer...

2016-08-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14592
  
LGTM pending Jenkins.






[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...

2016-08-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14591
  
Merging in master/2.0.






[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74370013
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
--- End diff --

sgtm





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14580
  
Which NoSQL platforms support `Using Outer Join`?





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14593
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14593
  
**[Test build #63586 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63586/consoleFull)**
 for PR 14593 at commit 
[`e4a832e`](https://github.com/apache/spark/commit/e4a832e61989297a77dae4a3b6cc1044dd66d499).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14593
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63586/
Test PASSed.





[GitHub] spark issue #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-08-10 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13775
  
@yhuai You mean just using `sql("SELECT * FROM t").count()`?





[GitHub] spark pull request #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-08-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13775#discussion_r74369496
  
--- Diff: 
sql/hive/src/main/java/org/apache/hadoop/hive/ql/io/orc/VectorizedSparkOrcNewRecordReader.java
 ---
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io.orc;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.lang.NotImplementedException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.hadoop.mapreduce.lib.input.FileSplit;
+
+import org.apache.spark.sql.catalyst.InternalRow;
+import org.apache.spark.sql.catalyst.util.ArrayData;
+import org.apache.spark.sql.catalyst.util.MapData;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.Decimal;
+import org.apache.spark.unsafe.types.CalendarInterval;
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * A RecordReader that returns InternalRow for Spark SQL execution.
+ * This reader uses an internal reader that returns Hive's VectorizedRowBatch. An adapter
+ * class is used to return internal row by directly accessing data in column vectors.
+ */
+public class VectorizedSparkOrcNewRecordReader
+extends org.apache.hadoop.mapreduce.RecordReader {
+  private final org.apache.hadoop.mapred.RecordReader reader;
+  private final int numColumns;
+  private VectorizedRowBatch internalValue;
+  private float progress = 0.0f;
+  private List columnIDs;
+
+  private long numRowsOfBatch = 0;
+  private int indexOfRow = 0;
+
+  private final Row row;
+
+  public VectorizedSparkOrcNewRecordReader(
+  Reader file,
+  JobConf conf,
+  FileSplit fileSplit,
+  List columnIDs) throws IOException {
+List types = file.getTypes();
+numColumns = (types.size() == 0) ? 0 : types.get(0).getSubtypesCount();
+this.reader = new SparkVectorizedOrcRecordReader(file, conf,
+  new org.apache.hadoop.mapred.FileSplit(fileSplit));
+
+this.columnIDs = new ArrayList<>(columnIDs);
+this.internalValue = this.reader.createValue();
+this.progress = reader.getProgress();
+this.row = new Row(this.internalValue.cols, this.columnIDs);
+  }
+
+  @Override
+  public void close() throws IOException {
+reader.close();
+  }
+
+  @Override
+  public NullWritable getCurrentKey() throws IOException,
+  InterruptedException {
+return NullWritable.get();
+  }
+
+  @Override
+  public InternalRow getCurrentValue() throws IOException,
+  InterruptedException {
+if (indexOfRow >= numRowsOfBatch) {
+  return null;
+}
+row.rowId = indexOfRow;
+indexOfRow++;
+
+return row;
+  }
+
+  @Override
+  public float getProgress() throws IOException, InterruptedException {
+return progress;
+  }
+
+  @Override
+  public void initialize(InputSplit split, TaskAttemptContext context)
+  throws IOException, InterruptedException {
+  }
+
+  @Override
+  public boolean nextKeyValue() throws IOException, InterruptedException {
+if 

[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74369492
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
--- End diff --

Can I add it in a separate pull request? I want to add all literal parsing 
here, but I don't want to distract from this pull request.





[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74369336
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
--- End diff --

Can you add the boundary conditions for int as well? 
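
As a rough illustration of which literals such boundary cases would cover, the sketch below derives the in-range and just-out-of-range values for a signed integer of a given width. `boundary_query` is a hypothetical helper written for this note; it is not part of the Spark test suite.

```python
def boundary_query(bits, overflow=False):
    """Build a `select max, min` query for a signed integer of the given width.

    With overflow=True, both literals sit one step outside the valid range.
    """
    max_value = 2 ** (bits - 1) - 1
    min_value = -(2 ** (bits - 1))
    if overflow:
        max_value += 1
        min_value -= 1
    return f"select {max_value}, {min_value}"


print(boundary_query(32))                 # select 2147483647, -2147483648
print(boundary_query(32, overflow=True))  # select 2147483648, -2147483649
print(boundary_query(64))                 # select 9223372036854775807, -9223372036854775808
```

The 64-bit case reproduces the query added in the diff above, and the overflowing 32-bit case matches the literals in the hunk header.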





[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74369163
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
 -- !query 2 schema

-struct<9223372036854775808:decimal(19,0),(-9223372036854775809):decimal(19,0)>
+struct<9223372036854775807:bigint,(-9223372036854775808):decimal(19,0)>
--- End diff --

Here it is https://issues.apache.org/jira/browse/SPARK-17013





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14593
  
**[Test build #63586 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63586/consoleFull)**
 for PR 14593 at commit 
[`e4a832e`](https://github.com/apache/spark/commit/e4a832e61989297a77dae4a3b6cc1044dd66d499).





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14593
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63583/
Test PASSed.





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14593
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74369002
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
 -- !query 2 schema

-struct<9223372036854775808:decimal(19,0),(-9223372036854775809):decimal(19,0)>
+struct<9223372036854775807:bigint,(-9223372036854775808):decimal(19,0)>
--- End diff --

@petermaxlee can you file a jira ticket for this bug?






[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14593
  
**[Test build #63583 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63583/consoleFull)**
 for PR 14593 at commit 
[`3cff947`](https://github.com/apache/spark/commit/3cff9477b10814d8fc9eeb27556b285e01d38956).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * 
`[Tokenization](http://en.wikipedia.org/wiki/Lexical_analysis#Tokenization) is 
the process of taking text (such as a sentence) and breaking it into individual 
terms (usually words). A simple 
[Tokenizer](api/scala/index.html#org.apache.spark.ml.feature.Tokenizer) class 
provides this functionality. The example below shows how to split sentences 
into sequences of words.`
  * `* *(Breaking change)* The `apply` and `copy` methods for the case 
class 
[`BoostingStrategy`](api/scala/index.html#org.apache.spark.mllib.tree.configuration.BoostingStrategy)
 have been changed because of a modification to the case class fields. This 
could be an issue for users who use `BoostingStrategy` to set GBT parameters.`
  * `* *(Breaking change)* The return value of 
[`LDA.run`](api/scala/index.html#org.apache.spark.mllib.clustering.LDA) has 
changed. It now returns an abstract class `LDAModel` instead of the concrete 
class `DistributedLDAModel`. The object of type `LDAModel` can still be cast to 
the appropriate concrete type, which depends on the optimization algorithm.`
  * `* In `DecisionTree`, the deprecated class method `train` has been 
removed. (The object/static `train` methods remain.)`
  * `* The `scoreCol` output column (with default value \"score\") was 
renamed to be `probabilityCol` (with default value \"probability\"). The type 
was originally `Double` (for the probability of class 1.0), but it is now 
`Vector` (for the probability of each class, to support multiclass 
classification in the future).`
  * `labels - the number of times any class was predicted correctly (true 
positives) normalized by the number of data`





[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74368974
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
 -- !query 2 schema

-struct<9223372036854775808:decimal(19,0),(-9223372036854775809):decimal(19,0)>
+struct<9223372036854775807:bigint,(-9223372036854775808):decimal(19,0)>
--- End diff --

I'd call this a bug. I tried in Spark 1.6 and it was returning double 
(which was worse).

Here's postgres:
```
rxin=# select pg_typeof(-9223372036854775808);
 pg_typeof 
---
 bigint
(1 row)

rxin=# select pg_typeof(-9223372036854775807);
 pg_typeof 
---
 bigint
(1 row)

rxin=# select pg_typeof(-9223372036854775806);
 pg_typeof 
---
 bigint
(1 row)
```
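
The thread does not say where the `decimal(19,0)` comes from. One possible explanation, offered only as an assumption, is that the digits are typed before the unary minus is applied, so `9223372036854775808` overflows bigint before it is ever negated. The sketch below contrasts that with typing the signed literal as a whole; both function names are hypothetical and neither is Spark's parser.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def type_of_signed_literal(text):
    """Type the literal with its sign included (matches the postgres output above)."""
    value = int(text)
    if INT_MIN <= value <= INT_MAX:
        return "int"
    if LONG_MIN <= value <= LONG_MAX:
        return "bigint"
    # Precision is the number of digits, ignoring the sign.
    return f"decimal({len(text.lstrip('-'))},0)"


def type_of_negated_literal(text):
    """Type the digits first, then negate; negation keeps the operand's type."""
    return type_of_signed_literal(text.lstrip("-"))


print(type_of_signed_literal("-9223372036854775808"))   # bigint
print(type_of_negated_literal("-9223372036854775808"))  # decimal(19,0)
```

Under this assumption, the second function reproduces the schema in the diff, while the first gives the bigint that postgres reports.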







[GitHub] spark issue #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13775
  
**[Test build #63585 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63585/consoleFull)**
 for PR 13775 at commit 
[`06066eb`](https://github.com/apache/spark/commit/06066eb241eb97c4cf363adff2b0160b8a423ab8).





[GitHub] spark pull request #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-08-10 Thread dafrista
Github user dafrista commented on a diff in the pull request:

https://github.com/apache/spark/pull/13775#discussion_r74368871
  
--- Diff: 
sql/hive/src/main/java/org/apache/hadoop/hive/ql/io/orc/VectorizedSparkOrcNewRecordReader.java
 ---
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io.orc;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.lang.NotImplementedException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.hadoop.mapreduce.lib.input.FileSplit;
+
+import org.apache.spark.sql.catalyst.InternalRow;
+import org.apache.spark.sql.catalyst.util.ArrayData;
+import org.apache.spark.sql.catalyst.util.MapData;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.Decimal;
+import org.apache.spark.unsafe.types.CalendarInterval;
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * A RecordReader that returns InternalRow for Spark SQL execution.
+ * This reader uses an internal reader that returns Hive's VectorizedRowBatch. An adapter
+ * class is used to return internal row by directly accessing data in column vectors.
+ */
+public class VectorizedSparkOrcNewRecordReader
+extends org.apache.hadoop.mapreduce.RecordReader {
+  private final org.apache.hadoop.mapred.RecordReader reader;
+  private final int numColumns;
+  private VectorizedRowBatch internalValue;
+  private float progress = 0.0f;
+  private List columnIDs;
+
+  private long numRowsOfBatch = 0;
+  private int indexOfRow = 0;
+
+  private final Row row;
+
+  public VectorizedSparkOrcNewRecordReader(
+  Reader file,
+  JobConf conf,
+  FileSplit fileSplit,
+  List columnIDs) throws IOException {
+List types = file.getTypes();
+numColumns = (types.size() == 0) ? 0 : types.get(0).getSubtypesCount();
+this.reader = new SparkVectorizedOrcRecordReader(file, conf,
+  new org.apache.hadoop.mapred.FileSplit(fileSplit));
+
+this.columnIDs = new ArrayList<>(columnIDs);
+this.internalValue = this.reader.createValue();
+this.progress = reader.getProgress();
+this.row = new Row(this.internalValue.cols, this.columnIDs);
+  }
+
+  @Override
+  public void close() throws IOException {
+reader.close();
+  }
+
+  @Override
+  public NullWritable getCurrentKey() throws IOException,
+  InterruptedException {
+return NullWritable.get();
+  }
+
+  @Override
+  public InternalRow getCurrentValue() throws IOException,
+  InterruptedException {
+if (indexOfRow >= numRowsOfBatch) {
+  return null;
+}
+row.rowId = indexOfRow;
+indexOfRow++;
+
+return row;
+  }
+
+  @Override
+  public float getProgress() throws IOException, InterruptedException {
+return progress;
+  }
+
+  @Override
+  public void initialize(InputSplit split, TaskAttemptContext context)
+  throws IOException, InterruptedException {
+  }
+
+  @Override
+  public boolean nextKeyValue() throws IOException, InterruptedException {
+if 

[GitHub] spark issue #14592: [SPARK-17011][SQL] Support testing exceptions in SQLQuer...

2016-08-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14592
  
I'd like to do it incrementally, ideally one SQL test file (xxx.sql) per PR. 
We can have many PRs open at the same time; they are unlikely to conflict 
with each other.





[GitHub] spark pull request #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and...

2016-08-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14590#discussion_r74368764
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -126,14 +129,18 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
   cleaned.split("(?<=[^]);").map(_.trim).filter(_ != "").toSeq
 }
 
+// Create a local SparkSession to have stronger isolation between different test cases.
+// This does not isolate catalog changes.
+val localSparkSession = spark.newSession()
--- End diff --

SparkSession should be fine. SparkContext is the expensive one.






[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74368708
  
--- Diff: sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
 -- !query 2 schema

-struct<9223372036854775808:decimal(19,0),(-9223372036854775809):decimal(19,0)>
+struct<9223372036854775807:bigint,(-9223372036854775808):decimal(19,0)>
--- End diff --

maybe a parser bug? cc @hvanhovell 
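
One way the schema above could arise, sketched in Java purely as an illustration: if a parser types the unsigned digits first and only then applies the unary minus, `9223372036854775808` no longer fits a 64-bit long and is typed as a decimal before the sign is seen. The class and method names below are hypothetical, and this is not Spark's parser code.

```java
import java.math.BigInteger;

/** Hypothetical sketch of two literal-typing orders; not Spark's actual parser. */
final class LiteralTypingSketch {
  private static final BigInteger INT_MIN = BigInteger.valueOf(Integer.MIN_VALUE);
  private static final BigInteger INT_MAX = BigInteger.valueOf(Integer.MAX_VALUE);
  private static final BigInteger LONG_MIN = BigInteger.valueOf(Long.MIN_VALUE);
  private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

  /** Picks the narrowest of int, bigint, decimal(p,0) that can hold the value. */
  static String narrowestType(BigInteger v) {
    if (v.compareTo(INT_MIN) >= 0 && v.compareTo(INT_MAX) <= 0) {
      return "int";
    }
    if (v.compareTo(LONG_MIN) >= 0 && v.compareTo(LONG_MAX) <= 0) {
      return "bigint";
    }
    int precision = v.abs().toString().length();
    return "decimal(" + precision + ",0)";
  }

  /** Assumed order A: type the unsigned digits, then negate (the type is kept). */
  static String typeThenNegate(String literal) {
    String digits = literal.startsWith("-") ? literal.substring(1) : literal;
    return narrowestType(new BigInteger(digits));
  }

  /** Assumed order B: fold the sign into the literal, then type it. */
  static String negateThenType(String literal) {
    return narrowestType(new BigInteger(literal));
  }

  public static void main(String[] args) {
    String[] literals = {"9223372036854775807", "-9223372036854775808"};
    for (String literal : literals) {
      System.out.println(literal + " A=" + typeThenNegate(literal)
          + " B=" + negateThenType(literal));
    }
  }
}
```

Under order A the second literal comes out as `decimal(19,0)`, which matches the new expected schema; under order B it would be `bigint`.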





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14580
  
**[Test build #63584 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63584/consoleFull)**
 for PR 14580 at commit 
[`ddb4ddd`](https://github.com/apache/spark/commit/ddb4dddb1829098ef012cc63ddf059d663b8454b).





[GitHub] spark pull request #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and...

2016-08-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14590#discussion_r74368596
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -126,14 +129,18 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
   cleaned.split("(?<=[^]);").map(_.trim).filter(_ != "").toSeq
 }
 
+// Create a local SparkSession to have stronger isolation between different test cases.
+// This does not isolate catalog changes.
+val localSparkSession = spark.newSession()
--- End diff --

Is it expensive? I do remember other tests share one spark session for 
performance reasons.





[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14568
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14568
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63576/
Test FAILed.





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14580
  
I think we should think outside the SQL box. We know that Spark is not a 
subset of a DBMS.





[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14568
  
**[Test build #63576 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63576/consoleFull)**
 for PR 14568 at commit 
[`b4d4ea6`](https://github.com/apache/spark/commit/b4d4ea6213d1792e76a25cfe385fb2e3f11bfb6e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-08-10 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/13775
  
for the benchmark, how about we just test the scan operation?





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14593
  
Also, just for reviewers: the inconsistencies I listed in the PR description 
occur at random across the documentation. This PR makes them consistent, 
following the style guidelines and matching what the majority of the 
documentation already does.





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14593
  
BTW, this PR does not fix the wrong examples and inconsistent indentation in 
`structured-streaming-programming-guide.md`, because 
https://github.com/apache/spark/pull/14564 is handling them. I made a separate 
PR for this because that PR was originally about fixing code in `./examples`.





[GitHub] spark issue #14593: [MINOR][DOCS] Fix style in examples and inconsistent ind...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14593
  
**[Test build #63583 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63583/consoleFull)**
 for PR 14593 at commit 
[`3cff947`](https://github.com/apache/spark/commit/3cff9477b10814d8fc9eeb27556b285e01d38956).





[GitHub] spark pull request #14593: [MINOR][DOCS] Fix style in examples and inconsist...

2016-08-10 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/14593

[MINOR][DOCS] Fix style in examples and inconsistent indentation across 
documentation

## What changes were proposed in this pull request?

This PR fixes the documentation as below:

 - Remove unnecessary spaces that cause inconsistent spacing across the 
documentation.

 - Fix the style in examples in documentation. This includes below:

   - Python uses 4 spaces; Java and Scala use 2 spaces (see 
https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide).

   - Avoid excessive parentheses and curly braces for anonymous functions. 
(See https://github.com/databricks/scala-style-guide#anonymous)

 - Make consistent indentation for XML.

 - Remove trailing whitespace at the end of files and lines.
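
As a rough illustration of the style points listed above, here is a small Java sketch with 2-space indentation that contrasts a lambda written with redundant parentheses and braces against the concise form. The linked guide is about Scala anonymous functions, so this is only a Java analogue; the class and method names are hypothetical and are not taken from the Spark documentation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of the lambda style point; uses 2-space indentation. */
final class LambdaStyleSketch {
  /** Verbose form: redundant parentheses, braces and return. */
  static List<Integer> doubledVerbose(List<Integer> xs) {
    return xs.stream().map((x) -> { return x * 2; }).collect(Collectors.toList());
  }

  /** Concise form: same behavior with less punctuation. */
  static List<Integer> doubledConcise(List<Integer> xs) {
    return xs.stream().map(x -> x * 2).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Integer> xs = Arrays.asList(1, 2, 3);
    // Both forms compute the same list; only the amount of punctuation differs.
    System.out.println(doubledVerbose(xs).equals(doubledConcise(xs)));
  }
}
```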

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark minor-documentation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14593


commit c7d3e7b10b6bec585361aadef43e1f2046c0f5e2
Author: hyukjinkwon 
Date:   2016-08-11T04:06:39Z

Fix style in examples and inconsistent indentation across documentation

commit 3cff9477b10814d8fc9eeb27556b285e01d38956
Author: hyukjinkwon 
Date:   2016-08-11T04:36:18Z

Fix all similar instances across documentation







[GitHub] spark issue #14592: [SPARK-17011][SQL] Support testing exceptions in SQLQuer...

2016-08-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/14592
  
@cloud-fan after adding enough features to the test harness, do you think I 
should port all tests over in a single pull request, or do it more incrementally?






[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14580
  
Not sure which RDBMSs support `Using Outer Join`. The `NULL`s generated by 
outer joins are removed, which sounds a little strange. After all, `NULL` 
also has a meaning. 

In the plan (shown by EXPLAIN), it is not easy to tell whether this is a 
regular outer join or a using outer join. That is why I think we should 
introduce a new join type. At least users can then easily see that they are 
triggering a using outer join.
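
To make the difference concrete, here is a minimal Java sketch that simulates both join forms over two tiny in-memory tables. It assumes the standard SQL behavior in which `FULL OUTER JOIN ... USING (k)` outputs the key once as `COALESCE(a.k, b.k)`; the class, method names, and sample data are hypothetical and do not reflect how Spark plans the join.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch contrasting FULL OUTER JOIN ... ON with ... USING. */
final class UsingJoinSketch {
  /** ON a.k = b.k: both key columns are kept, NULL on the unmatched side. */
  static List<String> fullOuterOn(Map<Integer, String> a, Map<Integer, String> b) {
    List<String> rows = new ArrayList<>();
    for (Integer k : allKeys(a, b)) {
      Integer ak = a.containsKey(k) ? k : null;
      Integer bk = b.containsKey(k) ? k : null;
      rows.add(ak + "," + a.get(k) + "," + bk + "," + b.get(k));
    }
    return rows;
  }

  /** USING (k): a single key column, computed as COALESCE(a.k, b.k). */
  static List<String> fullOuterUsing(Map<Integer, String> a, Map<Integer, String> b) {
    List<String> rows = new ArrayList<>();
    for (Integer k : allKeys(a, b)) {
      Integer ak = a.containsKey(k) ? k : null;
      Integer bk = b.containsKey(k) ? k : null;
      Integer key = (ak != null) ? ak : bk;
      rows.add(key + "," + a.get(k) + "," + b.get(k));
    }
    return rows;
  }

  /** Sorted union of the keys of both tables. */
  private static TreeSet<Integer> allKeys(Map<Integer, String> a, Map<Integer, String> b) {
    TreeSet<Integer> keys = new TreeSet<>(a.keySet());
    keys.addAll(b.keySet());
    return keys;
  }

  public static void main(String[] args) {
    Map<Integer, String> a = new TreeMap<>();
    a.put(1, "a1");
    a.put(2, "a2");
    Map<Integer, String> b = new TreeMap<>();
    b.put(2, "b2");
    b.put(3, "b3");
    System.out.println(fullOuterOn(a, b));
    System.out.println(fullOuterUsing(a, b));
  }
}
```

In the `ON` form the row for key 3 carries a `NULL` in the left key column, while in the `USING` form the single key column is never `NULL` for these inputs. The two plans produce different outputs even though both are outer joins.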





[GitHub] spark issue #14592: [SPARK-17011][SQL] Support testing exceptions in SQLQuer...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14592
  
**[Test build #63582 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63582/consoleFull)**
 for PR 14592 at commit 
[`76defce`](https://github.com/apache/spark/commit/76defceb9fbaf13ca522da750d92eeb5f7799472).





[GitHub] spark pull request #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-08-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13775#discussion_r74367827
  
--- Diff: sql/hive/src/main/java/org/apache/hadoop/hive/ql/io/orc/VectorizedSparkOrcNewRecordReader.java ---
@@ -0,0 +1,318 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.io.orc;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.lang.NotImplementedException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.hadoop.mapreduce.lib.input.FileSplit;
+
+import org.apache.spark.sql.catalyst.InternalRow;
+import org.apache.spark.sql.catalyst.util.ArrayData;
+import org.apache.spark.sql.catalyst.util.MapData;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.Decimal;
+import org.apache.spark.unsafe.types.CalendarInterval;
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * A RecordReader that returns InternalRow for Spark SQL execution.
+ * This reader uses an internal reader that returns Hive's VectorizedRowBatch. An adapter
+ * class is used to return internal row by directly accessing data in column vectors.
+ */
+public class VectorizedSparkOrcNewRecordReader
+extends org.apache.hadoop.mapreduce.RecordReader {
+  private final org.apache.hadoop.mapred.RecordReader reader;
+  private final int numColumns;
+  private VectorizedRowBatch internalValue;
+  private float progress = 0.0f;
+  private List columnIDs;
+
+  private long numRowsOfBatch = 0;
+  private int indexOfRow = 0;
+
+  private final Row row;
+
+  public VectorizedSparkOrcNewRecordReader(
+  Reader file,
+  JobConf conf,
+  FileSplit fileSplit,
+  List columnIDs) throws IOException {
+List types = file.getTypes();
+numColumns = (types.size() == 0) ? 0 : types.get(0).getSubtypesCount();
+this.reader = new SparkVectorizedOrcRecordReader(file, conf,
+  new org.apache.hadoop.mapred.FileSplit(fileSplit));
+
+this.columnIDs = new ArrayList<>(columnIDs);
+this.internalValue = this.reader.createValue();
+this.progress = reader.getProgress();
+this.row = new Row(this.internalValue.cols, this.columnIDs);
+  }
+
+  @Override
+  public void close() throws IOException {
+reader.close();
+  }
+
+  @Override
+  public NullWritable getCurrentKey() throws IOException,
+  InterruptedException {
+return NullWritable.get();
+  }
+
+  @Override
+  public InternalRow getCurrentValue() throws IOException,
+  InterruptedException {
+if (indexOfRow >= numRowsOfBatch) {
+  return null;
+}
+row.rowId = indexOfRow;
+indexOfRow++;
+
+return row;
+  }
+
+  @Override
+  public float getProgress() throws IOException, InterruptedException {
+return progress;
+  }
+
+  @Override
+  public void initialize(InputSplit split, TaskAttemptContext context)
+  throws IOException, InterruptedException {
+  }
+
+  @Override
+  public boolean nextKeyValue() throws IOException, InterruptedException {
+if 

[GitHub] spark issue #14588: [SPARK-17005][SQL] fix method tpe in trait AnnotationApi...

2016-08-10 Thread keypointt
Github user keypointt commented on the issue:

https://github.com/apache/spark/pull/14588
  
Oh, I'm so sorry. It's breaking 2.10 anyway.

I'll double-check Scala version compatibility the next time I submit a PR.





[GitHub] spark pull request #14588: [SPARK-17005][SQL] fix method tpe in trait Annota...

2016-08-10 Thread keypointt
Github user keypointt closed the pull request at:

https://github.com/apache/spark/pull/14588





[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14102
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63575/
Test PASSed.





[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14102
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14102
  
**[Test build #63575 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63575/consoleFull)**
 for PR 14102 at commit 
[`bceda7b`](https://github.com/apache/spark/commit/bceda7ba4f06c0b6fd99f11ef2662f9f3a154af0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14589: [SPARK-17007][SQL] Move test data files into a te...

2016-08-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14589





[GitHub] spark issue #14592: [SPARK-17011][SQL] Support testing exceptions in SQLQuer...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14592
  
**[Test build #63580 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63580/consoleFull)**
 for PR 14592 at commit 
[`1a7cdc0`](https://github.com/apache/spark/commit/1a7cdc029f1cba22f7b8c59eaa241575b287983f).





[GitHub] spark issue #14583: [SPARK-16994][SQL] PushDownPredicate should not ignore l...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14583
  
**[Test build #63581 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63581/consoleFull)**
 for PR 14583 at commit 
[`d23d348`](https://github.com/apache/spark/commit/d23d348bb0c88211d87063bceaaabff7cc7a8a7a).





[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14589
  
Thanks - merging in master.






[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74367468
  
--- Diff: sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
 -- !query 2 schema

-struct<9223372036854775808:decimal(19,0),(-9223372036854775809):decimal(19,0)>
+struct<9223372036854775807:bigint,(-9223372036854775808):decimal(19,0)>
--- End diff --

Also cc @sarutak who wrote the original test case.






[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14589
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14589
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63573/
Test PASSed.





[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/14592#discussion_r74367410
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/number-format.sql.out ---
@@ -19,16 +19,24 @@ struct<2147483648:bigint,(-2147483649):bigint>
 
 
 -- !query 2
-select 9223372036854775808, -9223372036854775809
+select 9223372036854775807, -9223372036854775808
 -- !query 2 schema

-struct<9223372036854775808:decimal(19,0),(-9223372036854775809):decimal(19,0)>
+struct<9223372036854775807:bigint,(-9223372036854775808):decimal(19,0)>
--- End diff --

"-9223372036854775808" is a valid long value (Long.MinValue) but Spark 
treats it as a decimal(19, 0) because "9223372036854775808" is out of range. Is 
this expected?

cc @cloud-fan 
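
One way the typing in the diff above could arise, sketched in Python. This assumes the parser types the unsigned digits first and applies the unary minus afterwards; the function names are hypothetical and this is a model of the behaviour, not Spark's parser code.

```python
# Hypothetical model of numeric-literal typing; not Spark's actual parser.
INT_MAX = 2**31 - 1
LONG_MAX = 2**63 - 1   # 9223372036854775807
LONG_MIN = -2**63      # -9223372036854775808


def type_of_unsigned_literal(digits: str) -> str:
    """Pick the narrowest type that can hold the unsigned magnitude."""
    value = int(digits)
    if value <= INT_MAX:
        return "int"
    if value <= LONG_MAX:
        return "bigint"
    return f"decimal({len(digits)},0)"


def type_of_negated_literal(digits: str) -> str:
    """Assumption: the minus sign is applied after typing, so the type of
    the magnitude is kept even when the negated value would fit in a long."""
    return type_of_unsigned_literal(digits)


print(type_of_unsigned_literal("9223372036854775807"))  # largest long
print(type_of_negated_literal("9223372036854775808"))   # magnitude of Long.MinValue
```

Under that assumption the magnitude of Long.MinValue overflows before the sign is considered, which would explain the decimal(19,0) column in the expected output.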





[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14589
  
**[Test build #63573 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63573/consoleFull)**
 for PR 14589 at commit 
[`3bc7c03`](https://github.com/apache/spark/commit/3bc7c03cb7ea226e2ace29e771b9b64eee91d13d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14546
  
**[Test build #63579 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63579/consoleFull)**
 for PR 14546 at commit 
[`32c639c`](https://github.com/apache/spark/commit/32c639c49d23f0873b5fcc4c28fa809ee87f7005).





[GitHub] spark pull request #14592: [SPARK-17011][SQL] Support testing exceptions in ...

2016-08-10 Thread petermaxlee
GitHub user petermaxlee opened a pull request:

https://github.com/apache/spark/pull/14592

[SPARK-17011][SQL] Support testing exceptions in SQLQueryTestSuite

## What changes were proposed in this pull request?
This patch adds exception testing to SQLQueryTestSuite. When there is an 
exception in query execution, the query result contains the exception class 
along with the exception message.

As part of this, I moved some additional test cases for limit from 
SQLQuerySuite over to SQLQueryTestSuite.

## How was this patch tested?
This is a test harness change.
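
The idea of recording the exception class and message as the query result can be sketched as follows. This is a minimal Python illustration, not the Scala code in the PR; `run_query` and `fake_engine` are hypothetical names, and the error message is a stand-in.

```python
# Hypothetical sketch of a golden-file harness step; names are illustrative.
from typing import Callable


def run_query(execute: Callable[[str], str], sql: str) -> str:
    """Return the query output, or the exception class and message on failure."""
    try:
        return execute(sql)
    except Exception as exc:
        return f"{type(exc).__name__}\n{exc}"


def fake_engine(sql: str) -> str:
    """Stand-in for a real engine: rejects a negative limit."""
    if "limit -1" in sql.lower():
        raise ValueError("limit must be non-negative")
    return "1"


print(run_query(fake_engine, "select 1"))
print(run_query(fake_engine, "select 1 limit -1"))
```

Because the failure is turned into ordinary text, it can be compared against an expected-output file in the same way as a successful result.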

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/petermaxlee/spark SPARK-17011

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14592.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14592


commit 1a7cdc029f1cba22f7b8c59eaa241575b287983f
Author: petermaxlee 
Date:   2016-08-11T04:19:56Z

[SPARK-17011][SQL] Support testing exceptions in SQLQueryTestSuite







[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14397
  
**[Test build #63578 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63578/consoleFull)**
 for PR 14397 at commit 
[`178813e`](https://github.com/apache/spark/commit/178813ebf6e7d5f58ebab7784e07bfd5b8c5d883).





[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14546
  
Thank you for the review, @gatorsmile .





[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14397
  
**[Test build #63577 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63577/consoleFull)**
 for PR 14397 at commit 
[`624bb3d`](https://github.com/apache/spark/commit/624bb3d9f6ffe558c1897501c06c76f938e15602).





[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14397
  
Rebased just to resolve conflicts.





[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution in CTE by ...

2016-08-10 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14452
  
@gatorsmile Do you have a concrete example for that?





[GitHub] spark pull request #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and...

2016-08-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14590





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14590
  
The failed Python test is unrelated. I'm going to merge this in master. 
Thanks.






[GitHub] spark pull request #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and...

2016-08-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14590#discussion_r74366616
  
--- Diff: sql/core/src/test/resources/sql-tests/results/datetime.sql.out ---
@@ -0,0 +1,10 @@
+-- Automatically generated by org.apache.spark.sql.SQLQueryTestSuite
--- End diff --

It might be better to remove the package name so we don't need to change 
all the generated files when we move this class.






[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14580
  
I have already pinged the previously involved committers. Let us see what 
feedback they have. 





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14580
  
Yep. `EliminateOuterJoin` should be updated properly. Any ideas? If you have 
a more general idea, you can make a PR to override this. You made this optimizer. 
:)





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14580
  
That is a public API. We are unable to remove it. 
https://github.com/apache/spark/pull/8600 has a serious bug. It has been fixed 
in another PR: https://github.com/apache/spark/pull/10353. 

Now, the issue is how to deal with using/natural outer joins. Maybe we can 
introduce new join types. Or, in the rule, we can find a hacky way to know 
whether this outer join is a natural/using join.
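
Why a using/natural outer join needs special care can be illustrated with a small Python model. It assumes the standard semantics in which the USING column of a full outer join is `coalesce(left.k, right.k)`, and it assumes unique keys; the helper names are hypothetical, and the thread does not confirm that this is exactly the failure reported in the PR.

```python
# Hypothetical model: rows are (left.k, right.k) pairs with unique keys.
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int]]


def left_outer_join(left: List[int], right: List[int]) -> List[Row]:
    """Keep every left key; pair it with the right key when it matches."""
    return [(l, l if l in right else None) for l in left]


def full_outer_join(left: List[int], right: List[int]) -> List[Row]:
    """Left outer join plus the right-only rows."""
    return left_outer_join(left, right) + [(None, r) for r in right if r not in left]


def using_key(row: Row) -> Optional[int]:
    """The USING column: coalesce(left.k, right.k)."""
    l, r = row
    return l if l is not None else r


left, right = [1, 2], [2, 3]

# Filter on the coalesced column after a full outer join.
correct = [using_key(r) for r in full_outer_join(left, right) if using_key(r) == 3]

# Same filter after an (unsafe) rewrite to a left outer join, which would only
# be valid if the predicate referred to left.k alone.
rewritten = [using_key(r) for r in left_outer_join(left, right) if using_key(r) == 3]

print(correct)
print(rewritten)
```

In this model the predicate on the coalesced key does not reject rows whose left side is null, so treating it as a left-side predicate drops the right-only row.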





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14580
  
Yep. Exactly. That is what I mean. That is not a regular outer join you 
considered in this optimizer and now both features are in Spark.





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14590
  
**[Test build #3216 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3216/consoleFull)**
 for PR 14590 at commit 
[`e061820`](https://github.com/apache/spark/commit/e0618203c317f8b8211c0e983403834f8e39a950).





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14580
  
For the regular outer join, the rule works fine. 

The issue you hit is caused by "using outer join" + "outer join 
elimination". Thus, your fix does not resolve the root issue. 





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14590
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14590
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63572/
Test FAILed.





[GitHub] spark pull request #14576: [SPARK-16391][SQL] ReduceAggregator and partial a...

2016-08-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14576#discussion_r74365669
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/ReduceAggregator.scala 
---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.expressions
+
+import org.apache.spark.annotation.Experimental
+import org.apache.spark.sql.Encoder
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+
+/**
+ * :: Experimental ::
+ * An aggregator that uses a single associative and commutative reduce function. This reduce
+ * function can be used to go through all input values and reduces them to a single value.
+ * If there is no input, a null value is returned.
+ *
+ * @since 2.1.0
+ */
+@Experimental
+abstract class ReduceAggregator[T] extends Aggregator[T, (Boolean, T), T] {
+
+  // Question 1: Should func and encoder be parameters rather than abstract methods?
+  //  rxin: abstract method has better java compatibility and forces naming the concrete impl,
+  //  whereas parameter has better type inference (infer encoders via context bounds).
--- End diff --

+1
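
The aggregator described in the quoted doc comment can be sketched in Python. This is not the Scala class from the PR: the method names follow the usual zero/reduce/merge/finish shape, the reduce function is passed as a constructor parameter for brevity, and it is an assumption that the Boolean in the `(Boolean, T)` buffer marks whether any input has been seen.

```python
# Hypothetical Python model of a reduce-based aggregator.
from functools import reduce as fold
from typing import Any, Callable, Iterable, List, Tuple

Buffer = Tuple[bool, Any]  # (has a value been seen?, accumulated value)


class ReduceAggregator:
    def __init__(self, func: Callable[[Any, Any], Any]) -> None:
        self.func = func  # must be associative and commutative

    def zero(self) -> Buffer:
        return (False, None)

    def reduce(self, buf: Buffer, value: Any) -> Buffer:
        seen, acc = buf
        return (True, self.func(acc, value) if seen else value)

    def merge(self, b1: Buffer, b2: Buffer) -> Buffer:
        if not b1[0]:
            return b2
        if not b2[0]:
            return b1
        return (True, self.func(b1[1], b2[1]))

    def finish(self, buf: Buffer) -> Any:
        return buf[1]  # None when there was no input


def aggregate(agg: ReduceAggregator, partitions: Iterable[List[Any]]) -> Any:
    """Reduce each partition separately, then merge the partial buffers."""
    partials = [fold(agg.reduce, part, agg.zero()) for part in partitions]
    return agg.finish(fold(agg.merge, partials, agg.zero()))


print(aggregate(ReduceAggregator(lambda a, b: a + b), [[1, 2, 3], [4, 5]]))
print(aggregate(ReduceAggregator(lambda a, b: a + b), [[], []]))
```

The flag lets empty partitions merge as a no-op without requiring an identity element for the reduce function.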





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14590
  
**[Test build #63572 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63572/consoleFull)**
 for PR 14590 at commit 
[`e061820`](https://github.com/apache/spark/commit/e0618203c317f8b8211c0e983403834f8e39a950).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...

2016-08-10 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/spark/pull/14591
  
@andrewor14 





[GitHub] spark issue #14588: [SPARK-17005][SQL] fix method tpe in trait AnnotationApi...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14588
  
Please let me cc @srowen to make sure because I believe it is about 
building.





[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14568
  
**[Test build #63576 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63576/consoleFull)**
 for PR 14568 at commit 
[`b4d4ea6`](https://github.com/apache/spark/commit/b4d4ea6213d1792e76a25cfe385fb2e3f11bfb6e).





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14580
  
In addition, I'm wondering if you really want to remove that feature which 
was merged into 1.6 branch on Sep. 21 2015 and already released?





[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14591
  
**[Test build #63574 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63574/consoleFull)**
 for PR 14591 at commit 
[`9d9bc2a`](https://github.com/apache/spark/commit/9d9bc2ae1420d91cea7779f38d329579e1ec126a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14591
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14591
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63574/
Test PASSed.





[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14580
  
Hi, @gatorsmile . Thank you for the review. 

BTW, could you explain why you think the following?
> The fix does not look right to me.

What do you think the root cause is? I think I missed your context. For 
me, the current optimizer definitely works incorrectly (as we see in the reported 
case), and this PR fixes that now. I think this is not a SQL standard issue. If 
you give some counterexamples, I can grasp your concern here.





[GitHub] spark pull request #14583: [SPARK-16994][SQL] PushDownPredicate should not i...

2016-08-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14583#discussion_r74363622
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -1988,6 +1988,11 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
 }
   }
 
+  test("SPARK-16994: filter should not be pushed down into local limit") {
--- End diff --

Thank you, @gatorsmile .
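
The property named in the test title can be checked with a few lines of plain Python. This is a generic illustration with made-up data rather than the Spark test itself: applying a filter before a limit is not equivalent to applying it after.

```python
# Illustration only: limit-then-filter vs. filter-then-limit on a tiny list.
from itertools import islice

rows = [1, 2, 3, 4, 5, 6]


def is_even(x: int) -> bool:
    return x % 2 == 0


# Original plan: take the first 3 rows, then filter.
limit_then_filter = [x for x in islice(rows, 3) if is_even(x)]

# Plan after pushing the filter below the limit: filter, then take 3 rows.
filter_then_limit = list(islice((x for x in rows if is_even(x)), 3))

print(limit_then_filter)
print(filter_then_limit)
```

The two plans return different rows, so pushing the filter beneath the limit changes the query result.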





[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-08-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14102
  
@cloud-fan Thanks! I think it is ready to be reviewed.





[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14102
  
**[Test build #63575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63575/consoleFull)** for PR 14102 at commit [`bceda7b`](https://github.com/apache/spark/commit/bceda7ba4f06c0b6fd99f11ef2662f9f3a154af0).





[GitHub] spark issue #14591: [SPARK-17010][MINOR][DOC]Wrong description in memory man...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14591
  
**[Test build #63574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63574/consoleFull)** for PR 14591 at commit [`9d9bc2a`](https://github.com/apache/spark/commit/9d9bc2ae1420d91cea7779f38d329579e1ec126a).





[GitHub] spark pull request #14591: [SPARK-17010][MINOR][DOC]Wrong description in mem...

2016-08-10 Thread WangTaoTheTonic
GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/spark/pull/14591

[SPARK-17010][MINOR][DOC]Wrong description in memory management document

## What changes were proposed in this pull request?

Change the remaining percentage to the correct value.


## How was this patch tested?

Manual review




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/spark patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14591.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14591


commit 9d9bc2ae1420d91cea7779f38d329579e1ec126a
Author: Tao Wang 
Date:   2016-08-11T02:44:53Z

Update tuning.md







[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14589
  
**[Test build #63573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63573/consoleFull)** for PR 14589 at commit [`3bc7c03`](https://github.com/apache/spark/commit/3bc7c03cb7ea226e2ace29e771b9b64eee91d13d).





[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14589
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14589
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63571/
Test FAILed.





[GitHub] spark issue #14589: [SPARK-17007][SQL] Move test data files into a test-data...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14589
  
**[Test build #63571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63571/consoleFull)** for PR 14589 at commit [`17bc9c0`](https://github.com/apache/spark/commit/17bc9c0b259be87782e313f18b2b88de134811af).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-10 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/14567#discussion_r74362668
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -194,7 +194,7 @@ def save_function(self, obj, name=None):
 # we'll pickle the actual function object rather than simply saving a
 # reference (as is done in default pickler), via save_function_tuple.
 if islambda(obj) or obj.__code__.co_filename == '' or themodule is None:
-#print("save global", islambda(obj), obj.__code__.co_filename, modname, themodule)
+# print("save global", islambda(obj), obj.__code__.co_filename, modname, themodule)
--- End diff --

Seems like we might just want to remove this commented out line?





[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-10 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/14567#discussion_r74362636
  
--- Diff: python/pep8rc ---
@@ -0,0 +1,21 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[pep8]
--- End diff --

There is another pep8 config file at ./dev/tox.ini. It seems like it would be
good to have a single file (and also to unify the ignore lists).





[GitHub] spark issue #14590: [SPARK-17008][SPARK-17009][SQL] Normalization and isolat...

2016-08-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14590
  
**[Test build #63572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63572/consoleFull)** for PR 14590 at commit [`e061820`](https://github.com/apache/spark/commit/e0618203c317f8b8211c0e983403834f8e39a950).




