dongjoon-hyun commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1518435444
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3318,6 +3318,13 @@ object SQLConf {
.booleanConf
dongjoon-hyun commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1518435595
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3318,6 +3318,13 @@ object SQLConf {
.booleanConf
dongjoon-hyun commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1518435187
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3318,6 +3318,13 @@ object SQLConf {
.booleanConf
dongjoon-hyun commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1518433859
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -3318,6 +3318,13 @@ object SQLConf {
.booleanConf
ted-jenks commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1517931178
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala:
##
@@ -2426,21 +2426,34 @@ case class Chr(child: Expression)
cloud-fan commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1517826436
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala:
##
@@ -2426,21 +2426,34 @@ case class Chr(child: Expression)
cloud-fan commented on code in PR #45408:
URL: https://github.com/apache/spark/pull/45408#discussion_r1517664214
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala:
##
@@ -2426,21 +2426,27 @@ case class Chr(child: Expression)
ted-jenks commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1985424230
I think making this configurable makes the most sense. People processing
data for external systems with Spark can then choose to chunk or not chunk the
data based on what the use-case
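The chunk/no-chunk distinction above corresponds to the two encoders in the JDK's `java.util.Base64`: the RFC 4648 "basic" encoder emits one continuous string, while the RFC 2045 (MIME) encoder inserts a CRLF after every 76 encoded characters. A minimal sketch (the class name `Base64Demo` is illustrative, not from the PR):

```java
import java.util.Base64;

public class Base64Demo {
    public static void main(String[] args) {
        // 100 bytes encode to 136 base64 characters, i.e. more than one 76-char MIME line
        byte[] input = new byte[100];

        // RFC 4648 "basic" encoder: a single continuous string, no line separators
        String rfc4648 = Base64.getEncoder().encodeToString(input);

        // RFC 2045 (MIME) encoder: inserts "\r\n" after every 76 encoded characters
        String rfc2045 = Base64.getMimeEncoder().encodeToString(input);

        System.out.println(rfc4648.contains("\r\n")); // false
        System.out.println(rfc2045.contains("\r\n")); // true
    }
}
```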
yaooqinn commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1984930529
As the Spark community didn't receive any issue reports during the v3.3.0 - v3.5.1
releases, I think this is a corner case. Maybe we can make the config internal.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
dongjoon-hyun commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1984929745
+1 for the direction if we need to support both.
yaooqinn commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1984926315
Thank you @dongjoon-hyun.
In such circumstances, I guess we can add a configuration for base64 classes
to avoid breaking things again. AFAIK, Apache Hive also uses the JDK
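As a sketch of what such a configuration could gate (the flag name `chunkBase64` and the helper class are hypothetical illustrations, not the PR's actual SQLConf entry):

```java
import java.util.Base64;

// Hypothetical sketch: a boolean flag selects between the RFC 2045 (MIME,
// chunked) encoder and the RFC 4648 (unchunked) encoder.
public class ConfigurableBase64 {
    public static String encode(byte[] bytes, boolean chunkBase64) {
        Base64.Encoder encoder =
            chunkBase64 ? Base64.getMimeEncoder() : Base64.getEncoder();
        return encoder.encodeToString(bytes);
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes();
        System.out.println(encode(data, false)); // aGVsbG8=
    }
}
```

Keeping the legacy behavior reachable behind a flag lets users who depend on either output restore it without a code change.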
dongjoon-hyun commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1984433848
Thank you for the confirmation, @ted-jenks. Well, in this case, it's too
late to change the behavior again. Apache Spark 3.3 has been in EOL status
since last year, and I don't
ted-jenks commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1982836579
@dongjoon-hyun
> It sounds like you have other systems to read Spark's data.
Correct. The issue was that from 3.2 to 3.3 there was a behavior change in
the base64 encodings used
dongjoon-hyun commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1981829489
Hi, @ted-jenks . Could you elaborate on your correctness situation a little
more? It sounds like you have other systems to read Spark's data.
ted-jenks commented on PR #45408:
URL: https://github.com/apache/spark/pull/45408#issuecomment-1981542843
@dongjoon-hyun please could you take a look? This caused a big data correctness
issue for us.
ted-jenks opened a new pull request, #45408:
URL: https://github.com/apache/spark/pull/45408
### What changes were proposed in this pull request?
[SPARK-47307] Replace RFC 2045 base64 encoder with RFC 4648 encoder
### Why are the changes needed?
In
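A plausible reading of the correctness issue described in this thread is that an external system using a strict RFC 4648 decoder rejects Spark's RFC 2045 (MIME) output because of the embedded line separators. A minimal sketch of that failure mode, assuming the external system uses the JDK's strict "basic" decoder:

```java
import java.util.Base64;

public class StrictDecodeDemo {
    public static void main(String[] args) {
        // Long enough that the MIME encoder inserts a "\r\n" line separator
        byte[] input = new byte[100];
        String mime = Base64.getMimeEncoder().encodeToString(input);
        try {
            // The RFC 4648 "basic" decoder rejects characters outside the
            // base64 alphabet, including the CR/LF line separators.
            Base64.getDecoder().decode(mime);
            System.out.println("decoded");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected"); // prints "rejected"
        }
    }
}
```

The lenient `Base64.getMimeDecoder()` accepts both forms, which is why systems using it never observe the difference.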