[jira] [Commented] (SPARK-39005) Introduce 4 new functions to KVUtils

2022-04-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526967#comment-17526967
 ] 

Apache Spark commented on SPARK-39005:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/36331

> Introduce 4 new functions to KVUtils
> 
>
> Key: SPARK-39005
> URL: https://issues.apache.org/jira/browse/SPARK-39005
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> Introduce 4 new functions:
>  * count: counts the number of elements in the KVStoreView that satisfy a 
> predicate.
>  * foreach: applies a function f to all values produced by KVStoreView.
>  * mapToSeq: maps all values of KVStoreView to a new Seq using a 
> transformation function.
>  * size: the size of KVStoreView.
> Use the above functions to simplify the code related to `KVStoreView`; they 
> also release the underlying `LevelDB/RocksDBIterator` earlier.
>  
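The four helpers described above could be sketched as follows. This is a sketch only, written against a generic closeable-iterator view rather than Spark's actual `KVStoreView` API; the `CloseableView` trait and the signatures are assumptions, not the code from the linked pull request. The point is the try/finally, which releases the underlying iterator (e.g. a `LevelDB/RocksDBIterator`) as soon as traversal finishes instead of waiting for GC:

```scala
// Hypothetical stand-in for KVStoreView: anything that hands out an
// iterator that must be closed to release native resources.
trait CloseableView[T] {
  def closeableIterator(): Iterator[T] with AutoCloseable
}

object KVUtilsSketch {
  // count: number of elements satisfying a predicate; closes eagerly.
  def count[T](view: CloseableView[T])(p: T => Boolean): Long = {
    val iter = view.closeableIterator()
    try iter.count(p).toLong finally iter.close()
  }

  // foreach: applies f to all values produced by the view.
  def foreach[T](view: CloseableView[T])(f: T => Unit): Unit = {
    val iter = view.closeableIterator()
    try iter.foreach(f) finally iter.close()
  }

  // mapToSeq: materializes the view into a Seq via a transformation.
  // toList forces evaluation before the iterator is closed.
  def mapToSeq[T, B](view: CloseableView[T])(f: T => B): Seq[B] = {
    val iter = view.closeableIterator()
    try iter.map(f).toList finally iter.close()
  }

  // size: total number of elements in the view.
  def size[T](view: CloseableView[T]): Long = count(view)(_ => true)
}
```

Callers then never touch the iterator directly, which is what lets these helpers simplify the existing `KVStoreView` call sites.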



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-39005) Introduce 4 new functions to KVUtils

2022-04-23 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-39005:


Assignee: Apache Spark

> Introduce 4 new functions to KVUtils
> 
>
> Key: SPARK-39005
> URL: https://issues.apache.org/jira/browse/SPARK-39005
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>
> Introduce 4 new functions:
>  * count: counts the number of elements in the KVStoreView that satisfy a 
> predicate.
>  * foreach: applies a function f to all values produced by KVStoreView.
>  * mapToSeq: maps all values of KVStoreView to a new Seq using a 
> transformation function.
>  * size: the size of KVStoreView.
> Use the above functions to simplify the code related to `KVStoreView`; they 
> also release the underlying `LevelDB/RocksDBIterator` earlier.
>  






[jira] [Assigned] (SPARK-39005) Introduce 4 new functions to KVUtils

2022-04-23 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-39005:


Assignee: (was: Apache Spark)

> Introduce 4 new functions to KVUtils
> 
>
> Key: SPARK-39005
> URL: https://issues.apache.org/jira/browse/SPARK-39005
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> Introduce 4 new functions:
>  * count: counts the number of elements in the KVStoreView that satisfy a 
> predicate.
>  * foreach: applies a function f to all values produced by KVStoreView.
>  * mapToSeq: maps all values of KVStoreView to a new Seq using a 
> transformation function.
>  * size: the size of KVStoreView.
> Use the above functions to simplify the code related to `KVStoreView`; they 
> also release the underlying `LevelDB/RocksDBIterator` earlier.
>  






[jira] [Commented] (SPARK-39005) Introduce 4 new functions to KVUtils

2022-04-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526966#comment-17526966
 ] 

Apache Spark commented on SPARK-39005:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/36331

> Introduce 4 new functions to KVUtils
> 
>
> Key: SPARK-39005
> URL: https://issues.apache.org/jira/browse/SPARK-39005
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> Introduce 4 new functions:
>  * count: counts the number of elements in the KVStoreView that satisfy a 
> predicate.
>  * foreach: applies a function f to all values produced by KVStoreView.
>  * mapToSeq: maps all values of KVStoreView to a new Seq using a 
> transformation function.
>  * size: the size of KVStoreView.
> Use the above functions to simplify the code related to `KVStoreView`; they 
> also release the underlying `LevelDB/RocksDBIterator` earlier.
>  






[jira] [Updated] (SPARK-39005) Introduce 4 new functions to KVUtils

2022-04-23 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-39005:
-
Summary: Introduce 4 new functions to KVUtils  (was: Introduce more 
functions to KVUtils)

> Introduce 4 new functions to KVUtils
> 
>
> Key: SPARK-39005
> URL: https://issues.apache.org/jira/browse/SPARK-39005
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>
> Introduce 4 new functions:
>  * count: counts the number of elements in the KVStoreView that satisfy a 
> predicate.
>  * foreach: applies a function f to all values produced by KVStoreView.
>  * mapToSeq: maps all values of KVStoreView to a new Seq using a 
> transformation function.
>  * size: the size of KVStoreView.
> Use the above functions to simplify the code related to `KVStoreView`; they 
> also release the underlying `LevelDB/RocksDBIterator` earlier.
>  






[jira] [Created] (SPARK-39005) Introduce more functions to KVUtils

2022-04-23 Thread Yang Jie (Jira)
Yang Jie created SPARK-39005:


 Summary: Introduce more functions to KVUtils
 Key: SPARK-39005
 URL: https://issues.apache.org/jira/browse/SPARK-39005
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4.0
Reporter: Yang Jie


Introduce 4 new functions:
 * count: counts the number of elements in the KVStoreView that satisfy a 
predicate.
 * foreach: applies a function f to all values produced by KVStoreView.
 * mapToSeq: maps all values of KVStoreView to a new Seq using a 
transformation function.
 * size: the size of KVStoreView.

Use the above functions to simplify the code related to `KVStoreView`; they 
also release the underlying `LevelDB/RocksDBIterator` earlier.

 






[jira] [Created] (SPARK-39004) Flaky test: Bad node with multiple executors, job will still succeed with the right confs

2022-04-23 Thread Yang Jie (Jira)
Yang Jie created SPARK-39004:


 Summary: Flaky test:  Bad node with multiple executors, job will 
still succeed with the right confs
 Key: SPARK-39004
 URL: https://issues.apache.org/jira/browse/SPARK-39004
 Project: Spark
  Issue Type: Bug
  Components: Tests
Affects Versions: 3.4.0
Reporter: Yang Jie


 
{code:java}
2022-04-22T17:02:13.5414094Z HealthTrackerIntegrationSuite:
2022-04-22T17:02:16.8832352Z - Bad node with multiple executors, job will still 
succeed with the right confs *** FAILED *** (47 milliseconds)
2022-04-22T17:02:16.8835538Z   Map() did not equal Map(0 -> 42, 5 -> 42, 1 -> 
42, 6 -> 42, 9 -> 42, 2 -> 42, 7 -> 42, 3 -> 42, 8 -> 42, 4 -> 42) 
(HealthTrackerIntegrationSuite.scala:94)
2022-04-22T17:02:16.8841265Z   Analysis:
2022-04-22T17:02:16.8841865Z   HashMap(0: -> 42, 1: -> 42, 2: -> 42, 3: -> 42, 
4: -> 42, 5: -> 42, 6: -> 42, 7: -> 42, 8: -> 42, 9: -> 42)
2022-04-22T17:02:16.8845311Z   org.scalatest.exceptions.TestFailedException:
2022-04-22T17:02:16.8846292Z   at 
org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
2022-04-22T17:02:16.8849288Z   at 
org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
2022-04-22T17:02:16.8850381Z   at 
org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
2022-04-22T17:02:16.8965003Z   at 
org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
2022-04-22T17:02:16.8966291Z   at 
org.apache.spark.scheduler.HealthTrackerIntegrationSuite.$anonfun$new$7(HealthTrackerIntegrationSuite.scala:94)
2022-04-22T17:02:16.8967464Z   at 
org.apache.spark.scheduler.SchedulerIntegrationSuite.$anonfun$testScheduler$1(SchedulerIntegrationSuite.scala:98)
2022-04-22T17:02:16.8968433Z   at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
2022-04-22T17:02:16.8969230Z   at 
org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
2022-04-22T17:02:16.8970555Z   at 
org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
2022-04-22T17:02:16.9047570Z   at 
org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
2022-04-22T17:02:16.9050180Z   at 
org.scalatest.Transformer.apply(Transformer.scala:22)
2022-04-22T17:02:16.9052652Z   at 
org.scalatest.Transformer.apply(Transformer.scala:20)
2022-04-22T17:02:16.9055387Z   at 
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:190)
2022-04-22T17:02:16.9058514Z   at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:203)
2022-04-22T17:02:16.9063456Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:188)
2022-04-22T17:02:16.9071013Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:200)
2022-04-22T17:02:16.9073556Z   at 
org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
2022-04-22T17:02:16.9076457Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:200)
2022-04-22T17:02:16.9078537Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:182)
2022-04-22T17:02:16.9080596Z   at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:64)
2022-04-22T17:02:16.9082663Z   at 
org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
2022-04-22T17:02:16.9084136Z   at 
org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
2022-04-22T17:02:16.9085630Z   at 
org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:64)
2022-04-22T17:02:16.9087111Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:233)
2022-04-22T17:02:16.9088555Z   at 
org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
2022-04-22T17:02:16.9089953Z   at 
scala.collection.immutable.List.foreach(List.scala:431)
2022-04-22T17:02:16.9091357Z   at 
org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
2022-04-22T17:02:16.9092789Z   at 
org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
2022-04-22T17:02:16.9094202Z   at 
org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
2022-04-22T17:02:16.9095791Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:233)
2022-04-22T17:02:16.9097461Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:232)
2022-04-22T17:02:16.9098949Z   at 
org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1563)
2022-04-22T17:02:16.9105640Z   at org.scalatest.Suite.run(Suite.scala:1112)
2022-04-22T17:02:16.9107574Z   at org.scalatest.Suite.run$(Suite.scala:1094)
2022-04-22T17:02:16.9109163Z   at 
org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1563)
2022-04-22T17:02:16.9111219Z   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:237)
2022-04-22T17:02:16.9112842Z   at ...
{code}

[jira] [Commented] (SPARK-38897) DS V2 supports push down string functions

2022-04-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526954#comment-17526954
 ] 

Apache Spark commented on SPARK-38897:
--

User 'chenzhx' has created a pull request for this issue:
https://github.com/apache/spark/pull/36330

> DS V2 supports push down string functions
> -
>
> Key: SPARK-38897
> URL: https://issues.apache.org/jira/browse/SPARK-38897
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Zhixiong Chen
>Priority: Major
>







[jira] [Commented] (SPARK-38996) Use double quotes for types in error messages

2022-04-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526826#comment-17526826
 ] 

Apache Spark commented on SPARK-38996:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/36329

> Use double quotes for types in error messages
> -
>
> Key: SPARK-38996
> URL: https://issues.apache.org/jira/browse/SPARK-38996
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
> Fix For: 3.4.0
>
>
> All types should be printed in SQL style in error messages and wrapped in 
> double quotes. For example, the type DateType should be highlighted as 
> "DATE" to make it more visible in error messages.






[jira] [Created] (SPARK-39003) make AppHistoryServerPlugin public api for developer

2022-04-23 Thread gabrywu (Jira)
gabrywu created SPARK-39003:
---

 Summary: make AppHistoryServerPlugin public api for developer
 Key: SPARK-39003
 URL: https://issues.apache.org/jira/browse/SPARK-39003
 Project: Spark
  Issue Type: Wish
  Components: Spark Core
Affects Versions: 3.1.0, 2.4.0, 2.3.0
Reporter: gabrywu


The history server has an interface called {{AppHistoryServerPlugin}}, which 
is loaded via SPI. However, it is only accessible within the Spark package, 
so can we change it to a public interface? With that, developers could extend 
the application history.
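If the interface were made public, a third-party plugin might look like the sketch below. The method names mirror what the currently package-private interface looks like in the Spark source tree, but treat every signature and package path here as an assumption, not a public contract:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.scheduler.SparkListener
import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
import org.apache.spark.ui.SparkUI

// Hypothetical third-party plugin; compiles only if the interface is public.
class MyHistoryPlugin extends AppHistoryServerPlugin {
  // Listeners that replay event logs and write custom data into the store.
  override def createListeners(
      conf: SparkConf,
      store: ElementTrackingStore): Seq[SparkListener] = Seq.empty

  // Attach a custom tab or page to the history UI here.
  override def setupUI(ui: SparkUI): Unit = ()
}
```

Registration would stay SPI-based: list the implementing class in a `META-INF/services/org.apache.spark.status.AppHistoryServerPlugin` resource file on the history server's classpath.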






[jira] [Assigned] (SPARK-39002) StringEndsWith/Contains support push down to Parquet so that we can leverage dictionary filter

2022-04-23 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-39002:


Assignee: (was: Apache Spark)

> StringEndsWith/Contains support push down to Parquet so that we can leverage 
> dictionary filter
> --
>
> Key: SPARK-39002
> URL: https://issues.apache.org/jira/browse/SPARK-39002
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: EdisonWang
>Priority: Minor
>
> Support pushing StringEndsWith/StringContains down to Parquet so that we 
> can leverage Parquet dictionary filtering.
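The intuition behind dictionary filtering: a dictionary-encoded Parquet column stores its distinct values on a dictionary page, so a suffix/contains predicate can be evaluated against the dictionary alone to prove that no row in the row group can match, and the group is skipped without decoding any data pages (the same mechanism StringStartsWith already exploits via Parquet's `UserDefinedPredicate`). A minimal model of the drop test, not Spark's actual code:

```scala
// Sketch: row-group pruning against a dictionary page. If no distinct
// value satisfies the predicate, no row in the group can, so the whole
// row group is skipped.
def canDropRowGroup(dictionary: Set[String], predicate: String => Boolean): Boolean =
  !dictionary.exists(predicate)

// e.g. canDropRowGroup(dict, _.endsWith(".log"))
//      canDropRowGroup(dict, _.contains("error"))
```

Unlike min/max statistics, this works for predicates that are not range-friendly, which is exactly the endsWith/contains case.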






[jira] [Commented] (SPARK-39002) StringEndsWith/Contains support push down to Parquet so that we can leverage dictionary filter

2022-04-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526790#comment-17526790
 ] 

Apache Spark commented on SPARK-39002:
--

User 'WangGuangxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/36328

> StringEndsWith/Contains support push down to Parquet so that we can leverage 
> dictionary filter
> --
>
> Key: SPARK-39002
> URL: https://issues.apache.org/jira/browse/SPARK-39002
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: EdisonWang
>Priority: Minor
>
> Support pushing StringEndsWith/StringContains down to Parquet so that we 
> can leverage Parquet dictionary filtering.






[jira] [Assigned] (SPARK-39002) StringEndsWith/Contains support push down to Parquet so that we can leverage dictionary filter

2022-04-23 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-39002:


Assignee: Apache Spark

> StringEndsWith/Contains support push down to Parquet so that we can leverage 
> dictionary filter
> --
>
> Key: SPARK-39002
> URL: https://issues.apache.org/jira/browse/SPARK-39002
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: EdisonWang
>Assignee: Apache Spark
>Priority: Minor
>
> Support pushing StringEndsWith/StringContains down to Parquet so that we 
> can leverage Parquet dictionary filtering.






[jira] [Commented] (SPARK-39002) StringEndsWith/Contains support push down to Parquet so that we can leverage dictionary filter

2022-04-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526791#comment-17526791
 ] 

Apache Spark commented on SPARK-39002:
--

User 'WangGuangxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/36328

> StringEndsWith/Contains support push down to Parquet so that we can leverage 
> dictionary filter
> --
>
> Key: SPARK-39002
> URL: https://issues.apache.org/jira/browse/SPARK-39002
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: EdisonWang
>Priority: Minor
>
> Support pushing StringEndsWith/StringContains down to Parquet so that we 
> can leverage Parquet dictionary filtering.






[jira] [Created] (SPARK-39002) StringEndsWith/Contains support push down to Parquet so that we can leverage dictionary filter

2022-04-23 Thread EdisonWang (Jira)
EdisonWang created SPARK-39002:
--

 Summary: StringEndsWith/Contains support push down to Parquet so 
that we can leverage dictionary filter
 Key: SPARK-39002
 URL: https://issues.apache.org/jira/browse/SPARK-39002
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.3.0
Reporter: EdisonWang


Support pushing StringEndsWith/StringContains down to Parquet so that we can 
leverage Parquet dictionary filtering.






[jira] [Commented] (SPARK-39001) Document which options are unsupported in CSV and JSON functions

2022-04-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526785#comment-17526785
 ] 

Hyukjin Kwon commented on SPARK-39001:
--

cc [~revans2] and [~tgraves] FYI.

[~itholic] mind taking a look when you find some time?

> Document which options are unsupported in CSV and JSON functions
> 
>
> Key: SPARK-39001
> URL: https://issues.apache.org/jira/browse/SPARK-39001
> Project: Spark
>  Issue Type: Documentation
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> See https://github.com/apache/spark/pull/36294. Some CSV and JSON options 
> don't work in expressions because some of them are plan-wise options, like 
> parseMode = DROPMALFORMED.
> We should document which options do not work; possibly we should also throw 
> an exception.
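For concreteness, this is the kind of call where a plan-wise option is silently ignored (assumed behavior based on the linked PR discussion; the snippet is illustrative, not taken from the docs). The expression form must emit exactly one output row per input row, so a row-dropping parse mode has nothing it can drop:

```scala
// DROPMALFORMED works for the CSV *source*, which controls its own rows.
// In the from_csv expression, each input row must still yield one output
// row, so the option cannot take effect here.
df.select(from_csv($"value", schema, Map("mode" -> "DROPMALFORMED")))
```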






[jira] [Created] (SPARK-39001) Document which options are unsupported in CSV and JSON functions

2022-04-23 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-39001:


 Summary: Document which options are unsupported in CSV and JSON 
functions
 Key: SPARK-39001
 URL: https://issues.apache.org/jira/browse/SPARK-39001
 Project: Spark
  Issue Type: Documentation
  Components: SQL
Affects Versions: 3.3.0
Reporter: Hyukjin Kwon


See https://github.com/apache/spark/pull/36294. Some CSV and JSON options 
don't work in expressions because some of them are plan-wise options, like 
parseMode = DROPMALFORMED.

We should document which options do not work; possibly we should also throw 
an exception.






[jira] [Updated] (SPARK-38955) from_csv can corrupt surrounding lines if a lineSep is in the data

2022-04-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-38955:
-
Fix Version/s: 3.2.2

> from_csv can corrupt surrounding lines if a lineSep is in the data
> --
>
> Key: SPARK-38955
> URL: https://issues.apache.org/jira/browse/SPARK-38955
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Robert Joseph Evans
>Assignee: Hyukjin Kwon
>Priority: Blocker
> Fix For: 3.3.0, 3.2.2
>
>
> I don't know how critical this is. I was doing some general testing to 
> understand {{from_csv}} and found that if I happen to have a {{lineSep}} in 
> the input data, the next row appears to be corrupted. {{multiLine}} does not 
> appear to fix it. Because this is data corruption I am inclined to mark this 
> as CRITICAL or BLOCKER, but it is an odd corner case, so I'm not going to 
> set it myself.
> {code}
> Seq[String]("1,\n2,3,4,5","6,7,8,9,10", "11,12,13,14,15", 
> null).toDF.select(col("value"), from_csv(col("value"), 
> StructType(Seq(StructField("a", LongType), StructField("b", StringType))), 
> Map[String,String]())).show()
> +--------------+---------------+
> |         value|from_csv(value)|
> +--------------+---------------+
> |   1,\n2,3,4,5|      {1, null}|
> |    6,7,8,9,10|      {null, 8}|
> |11,12,13,14,15|       {11, 12}|
> |          null|           null|
> +--------------+---------------+
> {code}
> {code}
> Seq[String]("1,:2,3,4,5","6,7,8,9,10", "11,12,13,14,15", 
> null).toDF.select(col("value"), from_csv(col("value"), 
> StructType(Seq(StructField("a", LongType), StructField("b", StringType))), 
> Map[String,String]("lineSep" -> ":"))).show()
> +--------------+---------------+
> |         value|from_csv(value)|
> +--------------+---------------+
> |    1,:2,3,4,5|      {1, null}|
> |    6,7,8,9,10|      {null, 8}|
> |11,12,13,14,15|       {11, 12}|
> |          null|           null|
> +--------------+---------------+
> {code}
> {code}
> Seq[String]("1,\n2,3,4,5","6,7,8,9,10", "11,12,13,14,15", 
> null).toDF.select(col("value"), from_csv(col("value"), 
> StructType(Seq(StructField("a", LongType), StructField("b", StringType))), 
> Map[String,String]("lineSep" -> ":"))).show()
> +--------------+---------------+
> |         value|from_csv(value)|
> +--------------+---------------+
> |   1,\n2,3,4,5|       {1, \n2}|
> |    6,7,8,9,10|         {6, 7}|
> |11,12,13,14,15|       {11, 12}|
> |          null|           null|
> +--------------+---------------+
> {code}






[jira] [Assigned] (SPARK-38750) Test the error class: SECOND_FUNCTION_ARGUMENT_NOT_INTEGER

2022-04-23 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-38750:


Assignee: panbingkun

> Test the error class: SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
> --
>
> Key: SPARK-38750
> URL: https://issues.apache.org/jira/browse/SPARK-38750
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Max Gekk
>Assignee: panbingkun
>Priority: Minor
>  Labels: starter
>
> Add a test for the error class *SECOND_FUNCTION_ARGUMENT_NOT_INTEGER* to 
> QueryCompilationErrorsSuite. The test should cover the exception thrown in 
> QueryCompilationErrors:
> {code:scala}
>   def secondArgumentOfFunctionIsNotIntegerError(
>   function: String, e: NumberFormatException): Throwable = {
> // The second argument of '{function}' function needs to be an integer
> new AnalysisException(
>   errorClass = "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER",
>   messageParameters = Array(function),
>   cause = Some(e))
>   }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: 
> https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must have a check of:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class
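Following the linked UNSUPPORTED_FEATURE example, the requested test might look like the sketch below. The triggering query, the sqlState value, and the exact message text are assumptions to be checked against error-classes.json, not verified values:

```scala
test("SECOND_FUNCTION_ARGUMENT_NOT_INTEGER: non-integer second argument") {
  // A query assumed to reach secondArgumentOfFunctionIsNotIntegerError;
  // date_add with a non-integer amount string is one candidate.
  val e = intercept[AnalysisException] {
    sql("SELECT date_add('2011-11-11', '1.2')").collect()
  }
  // Check all three required items: error class, sqlState, full message.
  assert(e.getErrorClass === "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER")
  assert(e.getSqlState === "22023") // assumed value from error-classes.json
  assert(e.getMessage ===
    "The second argument of 'date_add' function needs to be an integer.")
}
```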






[jira] [Resolved] (SPARK-38750) Test the error class: SECOND_FUNCTION_ARGUMENT_NOT_INTEGER

2022-04-23 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-38750.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 36284
[https://github.com/apache/spark/pull/36284]

> Test the error class: SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
> --
>
> Key: SPARK-38750
> URL: https://issues.apache.org/jira/browse/SPARK-38750
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Max Gekk
>Assignee: panbingkun
>Priority: Minor
>  Labels: starter
> Fix For: 3.4.0
>
>
> Add a test for the error class *SECOND_FUNCTION_ARGUMENT_NOT_INTEGER* to 
> QueryCompilationErrorsSuite. The test should cover the exception thrown in 
> QueryCompilationErrors:
> {code:scala}
>   def secondArgumentOfFunctionIsNotIntegerError(
>   function: String, e: NumberFormatException): Throwable = {
> // The second argument of '{function}' function needs to be an integer
> new AnalysisException(
>   errorClass = "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER",
>   messageParameters = Array(function),
>   cause = Some(e))
>   }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: 
> https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must have a check of:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class


