Github user tilumi commented on the issue:
https://github.com/apache/spark/pull/14129
I optimized the 'codeGenWithArrayAggBufferNumericHistogram' algorithm and
the benchmark result is:
|(rows,
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14045
@viirya Thanks for your work! This would be very useful. I'll help review
this one soon after finishing my 2.0 tasks at hand!
---
If your project is set up for it, you can reply to this email
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14278
@viirya Basically we are mapping the logic in `ParquetRowConverter` for the
non-vectorized Parquet reader. It's just implemented at a lower level in the
case of vectorized reader.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14280
**[Test build #62587 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62587/consoleFull)**
for PR 14280 at commit
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14065#discussion_r71480635
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -390,8 +391,21 @@ private[spark] class Client(
// Upload Spark and
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14278
@viirya The updated schema field in this PR is only used to guide the
vectorized reader to interpret basic Parquet types into logical types (e.g.
Parquet `int32` to Spark `ByteType`, and Parquet
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/14280
It appears that the existing `sed` command works if you write `sed $'...'`
on both Linux and OS X. Is that easier?
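A minimal sketch of the `sed $'...'` idea above, assuming a shell that supports ANSI-C quoting (bash, zsh, ksh). The sample input and the variable names are illustrative only and do not come from the PR. With `$'...'`, the shell itself turns `\t` into a literal tab before `sed` runs, so the script no longer relies on `sed` understanding the escape:

```shell
# The shell expands \t inside $'...' to a real tab character,
# so sed receives a literal tab in the replacement text.
actual=$(printf 'a b\n' | sed $'s/ /\t/')

# Build the expected string with printf, which interprets \t portably.
expected=$(printf 'a\tb')

if [ "$actual" = "$expected" ]; then
  echo "space replaced by tab"
fi
```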
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/14280#discussion_r71480386
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -64,14 +67,19 @@ class SQLQuerySuite extends QueryTest with
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/14280#discussion_r71480336
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -64,14 +67,19 @@ class SQLQuerySuite extends QueryTest with
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/14280#discussion_r71479585
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -64,14 +67,19 @@ class SQLQuerySuite extends QueryTest with
Github user adrian-wang commented on a diff in the pull request:
https://github.com/apache/spark/pull/14280#discussion_r71477889
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -64,14 +67,19 @@ class SQLQuerySuite extends QueryTest
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/14280#discussion_r71477795
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
---
@@ -64,14 +67,19 @@ class SQLQuerySuite extends QueryTest with
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14280
**[Test build #62586 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62586/consoleFull)**
for PR 14280 at commit
GitHub user lw-lin opened a pull request:
https://github.com/apache/spark/pull/14280
[SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows...
## Problem
OS X's `sed` doesn't understand `\t` at all, so this `script` test would
fail:
```
== Results ==
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14207
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14207
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62581/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14207
**[Test build #62581 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62581/consoleFull)**
for PR 14207 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14278
@liancheng I don't think we should use the Spark requested schema for
vectorized Parquet reader. It only works for flat schema. We need the converted
schema for complex type support, as I do in
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14207
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14207
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62580/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14207
**[Test build #62580 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62580/consoleFull)**
for PR 14207 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14277
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14277
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62579/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14277
**[Test build #62579 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62579/consoleFull)**
for PR 14277 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14279
cc @rxin, @srowen and @deanchen
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14279#discussion_r71474579
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -477,15 +478,15 @@ class CSVSuite extends
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14278
Also cc @yhuai.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71474538
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala
---
@@ -316,27 +340,25 @@ object
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14272
Opened #14278 for the simpler yet more general fix.
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14278
@vanzin Could you please help verify this fix? The reason why #14272 works
is that the Parquet requested schema is generated using `clipParquetSchema()`.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14278
**[Test build #62585 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62585/consoleFull)**
for PR 14278 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14279
**[Test build #62584 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62584/consoleFull)**
for PR 14279 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14132
Merged build finished. Test PASSed.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14279#discussion_r71474425
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
---
@@ -62,6 +64,10 @@ object DateTimeUtils {
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62578/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14132
**[Test build #62578 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62578/consoleFull)**
for PR 14132 at commit
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14279
[SPARK-16216][SQL] Write Timestamp and Date in ISO 8601 formatted string by
default for CSV and JSON
## What changes were proposed in this pull request?
Currently, CSV datasource is
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71474127
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala
---
@@ -316,27 +340,25 @@ object
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14278
**[Test build #62583 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62583/consoleFull)**
for PR 14278 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14086
**[Test build #62582 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62582/consoleFull)**
for PR 14086 at commit
GitHub user liancheng opened a pull request:
https://github.com/apache/spark/pull/14278
[SPARK-16632][SQL] Use Spark requested schema to guide vectorized Parquet
reader initialization
## What changes were proposed in this pull request?
In `SpecificParquetRecordReaderBase`,
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/14086
The descriptions of PR/code are updated.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71472687
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71472473
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala
---
@@ -95,17 +95,41 @@ case class
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14132
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62577/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14132
**[Test build #62577 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62577/consoleFull)**
for PR 14132 at commit
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71472087
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71471996
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71471942
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
---
@@ -252,6 +252,165 @@ class DDLSuite extends QueryTest with
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14240#discussion_r71471878
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/PrunedScanSuite.scala ---
@@ -114,16 +114,15 @@ class PrunedScanSuite extends
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/14240#discussion_r71471414
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/PrunedScanSuite.scala ---
@@ -114,16 +114,15 @@ class PrunedScanSuite extends
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71471142
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71471136
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala
---
@@ -95,17 +95,41 @@ case class
Github user wangmiao1981 commented on the issue:
https://github.com/apache/spark/pull/14098
@liancheng Sorry for replying late. I was on vacation the last few days.
I have addressed most of your comments. Only the .md file is not updated
yet.
By the way, I am trying
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71471006
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71470908
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -522,31 +522,31 @@ object DDLUtils {
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71470907
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -522,31 +522,31 @@ object DDLUtils {
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71470873
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71470834
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71470616
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
---
@@ -252,6 +252,165 @@ class DDLSuite extends QueryTest with
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71470476
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71470367
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds:
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/14207#discussion_r71470370
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -522,31 +522,31 @@ object DDLUtils {
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/14086#discussion_r71469619
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -419,8 +422,13 @@ final class DataFrameWriter[T] private[sql](ds: