[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13837


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/13837#discussion_r87733959
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ---
@@ -322,6 +323,9 @@ case class DataSource(
   val equality = sparkSession.sessionState.conf.resolver
   StructType(schema.filterNot(f => partitionColumns.exists(equality(_, f.name))))
 }.orElse {
+  if (allPaths.isEmpty && !format.isInstanceOf[TextFileFormat]) {
--- End diff --

Hi @gatorsmile, would it be better to explain here that the text data source is 
excluded because it always uses a schema consisting of a single string field 
when the schema is not explicitly given?

BTW, should we maybe change `text.TextFileFormat` to `TextFileFormat` at 
https://github.com/apache/spark/pull/13837/files#diff-7a6cb188d2ae31eb3347b5629a679cecR139 ?
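
A minimal sketch of the check under discussion, written against stand-in types rather than Spark's classes. `Format`, `TextFormat`, `JsonFormat`, `PathCheckSketch` and `resolveSchema` are hypothetical names used only for illustration; the condition and the `'path' is not specified` message are the parts taken from this pull request.

```scala
// Stand-in types for illustration only; these are not Spark's classes.
sealed trait Format
case object TextFormat extends Format
case object JsonFormat extends Format

object PathCheckSketch {
  // A schema modelled as (column name, type name) pairs.
  type Schema = Seq[(String, String)]

  // The text source needs no inference: without a user schema it always
  // reads each line into a single string column.
  val textSchema: Schema = Seq("value" -> "string")

  def resolveSchema(userSchema: Option[Schema], paths: Seq[String], format: Format): Schema =
    userSchema.getOrElse {
      // Mirrors the condition in the diff: fail early unless the format is text.
      if (paths.isEmpty && format != TextFormat) {
        throw new IllegalArgumentException("'path' is not specified")
      }
      format match {
        case TextFormat => textSchema
        case JsonFormat => Seq.empty // real inference would sample the files here
      }
    }
}
```

Because the check sits on the fallback branch, a caller who supplies a schema never reaches it, which is why only the schema-less case needs the text exclusion.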





[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/13837#discussion_r87728887
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala ---
@@ -40,7 +40,7 @@ private[sql] class ParquetOptions(
 if (!shortParquetCompressionCodecNames.contains(codecName)) {
   val availableCodecs = shortParquetCompressionCodecNames.keys.map(_.toLowerCase)
   throw new IllegalArgumentException(s"Codec [$codecName] " +
-s"is not available. Available codecs are ${availableCodecs.mkString(", ")}.")
+s"is not available. Known codecs are ${availableCodecs.mkString(", ")}.")
--- End diff --

`Available` was used intentionally here: Parquet supports only snappy, gzip 
or lzo, whereas the text-based data sources also accept other compression 
codecs, so their message lists only the known ones.
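
The validation being discussed can be sketched as follows. This is a simplified stand-in, not the actual `ParquetOptions` class: `CodecCheckSketch` and `validate` are hypothetical names, and the map holds only the three codecs named in this thread, with plain strings as values (an assumption; the real map may hold more entries and different value types).

```scala
object CodecCheckSketch {
  // Assumption: only the three codecs named in this thread.
  val shortParquetCompressionCodecNames: Map[String, String] = Map(
    "snappy" -> "SNAPPY",
    "gzip" -> "GZIP",
    "lzo" -> "LZO")

  // Returns the codec for a short name, or fails with the list of accepted names.
  def validate(name: String): String = {
    val codecName = name.toLowerCase
    if (!shortParquetCompressionCodecNames.contains(codecName)) {
      val availableCodecs = shortParquetCompressionCodecNames.keys.map(_.toLowerCase)
      throw new IllegalArgumentException(s"Codec [$codecName] " +
        s"is not available. Available codecs are ${availableCodecs.mkString(", ")}.")
    }
    shortParquetCompressionCodecNames(codecName)
  }
}
```

Since the map is the complete set this function accepts, the names it lists are exactly the ones a caller can use, which is the sense of "Available" argued for above.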





[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-11-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/13837#discussion_r87723510
  
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -2684,8 +2684,7 @@ test_that("Call DataFrameWriter.load() API in Java without path and check argume
   # It makes sure that we can omit path argument in read.df API and then it calls
   # DataFrameWriter.load() without path.
   expect_error(read.df(source = "json"),
-   paste("Error in loadDF : analysis error - Unable to infer schema for JSON at .",
- "It must be specified manually"))
+   paste("Error in loadDF : illegal argument - 'path' is not specified"))
--- End diff --
--- End diff --

As I recall, this test intentionally calls `read.df` without the path argument?
cc @HyukjinKwon 





[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-11-11 Thread gatorsmile
GitHub user gatorsmile reopened a pull request:

https://github.com/apache/spark/pull/13837

[SPARK-16126] [SQL] Better Error Message When using DataFrameReader without `path`

 What changes were proposed in this pull request?

When users do not specify the path in `DataFrameReader` APIs, they can get a 
confusing error message. For example, 

``` Scala
spark.read.json()
```

Error message:

```
Unable to infer schema for JSON at . It must be specified manually;
```

After the fix, the error message is:

```
'path' is not specified
```

Another major goal of this PR is to add test cases for the latest changes 
in https://github.com/apache/spark/pull/13727. 
- orc read APIs
- illegal format name
- save API - empty path or illegal path
- load API - empty path
- illegal compression
- fixed the existing test case `prevent all column partitioning`

 How was this patch tested?

Test cases are added.
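
As a rough illustration of the kind of test described above (the empty-path load case), here is a self-contained sketch. It does not use Spark or ScalaTest: `LoadTestSketch`, `load` and `interceptMessage` are hypothetical stand-ins, and only the expected message text comes from this pull request.

```scala
object LoadTestSketch {
  // Hypothetical stand-in for a load call; not the DataFrameReader API.
  def load(paths: String*): Seq[String] = {
    if (paths.isEmpty) throw new IllegalArgumentException("'path' is not specified")
    paths
  }

  // Runs `body` and returns the message of the expected IllegalArgumentException.
  // If nothing is thrown, it fails with an AssertionError instead.
  def interceptMessage(body: => Any): String =
    try {
      body
      throw new AssertionError("expected IllegalArgumentException, but nothing was thrown")
    } catch {
      case e: IllegalArgumentException => e.getMessage
    }
}
```

A test would then compare `LoadTestSketch.interceptMessage(LoadTestSketch.load())` against `'path' is not specified`, so a change in wording is caught rather than only the exception type.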


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark dfWriterAudit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13837.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13837


commit 8d021e47e9a4e95ade99d617c77ef1e17245a796
Author: gatorsmile 
Date:   2016-06-17T18:24:42Z

test cases

commit 5e4a3c666dfb767215130df1a778e5f97d438c54
Author: gatorsmile 
Date:   2016-06-17T19:58:56Z

add test cases.

commit 26437151ff0db4c0010510de047f81b1808890f4
Author: gatorsmile 
Date:   2016-06-17T23:48:23Z

fix and test cases

commit cfc0188a0baa45aef1bae6604dd10450eaafd561
Author: gatorsmile 
Date:   2016-06-21T01:59:02Z

Merge remote-tracking branch 'upstream/master' into dfWriterAudit

commit 3007fe66d03a6a40dc530c13d44c27030118a8a4
Author: gatorsmile 
Date:   2016-06-21T13:27:16Z

more test case

commit a1ae7249322c17ea09be4e968535dc115b2acb64
Author: gatorsmile 
Date:   2016-06-22T06:12:56Z

fix test case

commit 635046a10cc059a6ae8756fb7bc7167f5621255c
Author: gatorsmile 
Date:   2016-06-22T16:04:51Z

fix test case







[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-08-21 Thread gatorsmile
Github user gatorsmile closed the pull request at:

https://github.com/apache/spark/pull/13837





[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-06-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13837#discussion_r68341391
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala ---
@@ -40,7 +40,7 @@ private[sql] class ParquetOptions(
 if (!shortParquetCompressionCodecNames.contains(codecName)) {
   val availableCodecs = shortParquetCompressionCodecNames.keys.map(_.toLowerCase)
   throw new IllegalArgumentException(s"Codec [$codecName] " +
-s"is not available. Available codecs are ${availableCodecs.mkString(", ")}.")
+s"is not available. Known codecs are ${availableCodecs.mkString(", ")}.")
--- End diff --

Just to make it consistent with the output of the other cases. See the 
code: 
https://github.com/apache/spark/blob/d6dc12ef0146ae409834c78737c116050961f350/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CompressionCodecs.scala#L49-L51





[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2016-06-23 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13837#discussion_r68324497
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala ---
@@ -40,7 +40,7 @@ private[sql] class ParquetOptions(
 if (!shortParquetCompressionCodecNames.contains(codecName)) {
   val availableCodecs = shortParquetCompressionCodecNames.keys.map(_.toLowerCase)
   throw new IllegalArgumentException(s"Codec [$codecName] " +
-s"is not available. Available codecs are ${availableCodecs.mkString(", ")}.")
+s"is not available. Known codecs are ${availableCodecs.mkString(", ")}.")
--- End diff --

why this change?

