[
https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893376#comment-15893376
]
Kay Ousterhout commented on SPARK-19796:
----------------------------------------
Do you think we should (separately) fix the underlying problem? Specifically,
we could:

(a) Not send the SPARK_JOB_DESCRIPTION property to the workers, since it's only
used on the master for the UI (and while users *could* access it, the variable
name SPARK_JOB_DESCRIPTION is spark-private, which suggests that it shouldn't
be used by users). Perhaps this is too risky because users could be using it?

(b) Truncate SPARK_JOB_DESCRIPTION to something reasonable (100 characters?)
before sending it to the workers. This is more backwards compatible if users
are actually reading the property, but maybe a useless intermediate approach?

(c) (Possibly in addition to one of the above) Log a warning if any of the
properties is longer than 100 characters (or some threshold).

Thoughts? I can file a JIRA if you think any of these is worthwhile.
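
To make options (b) and (c) concrete, here is a rough sketch of what they could look like, assuming the properties are held in a plain java.util.Properties. The object name PropertyGuard, its method names, and the 100-character default are hypothetical and only for illustration; none of this is existing Spark code.

```scala
import java.util.Properties
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Spark.
object PropertyGuard {
  // Assumed threshold, taken from the "100 characters?" suggestion above.
  val MaxLen = 100

  /** Option (b): a copy of props in which the value of `key` is cut to at most maxLen chars. */
  def truncated(props: Properties, key: String, maxLen: Int = MaxLen): Properties = {
    val out = new Properties()
    props.stringPropertyNames().asScala.foreach { k =>
      val v = props.getProperty(k)
      out.setProperty(k, if (k == key && v.length > maxLen) v.take(maxLen) else v)
    }
    out
  }

  /** Option (c): names of the properties whose values are longer than maxLen, sorted. */
  def oversized(props: Properties, maxLen: Int = MaxLen): Seq[String] =
    props.stringPropertyNames().asScala.toSeq
      .filter(k => props.getProperty(k).length > maxLen)
      .sorted
}
```

A caller could log one warning per name returned by oversized(props) and ship truncated(props, "spark.job.description") to the workers. Note that the threshold here counts characters, while the serialization limit that triggered this issue is on encoded bytes.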
> taskScheduler fails serializing long statements received by thrift server
> -------------------------------------------------------------------------
>
> Key: SPARK-19796
> URL: https://issues.apache.org/jira/browse/SPARK-19796
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.2.0
> Reporter: Giambattista
> Priority: Blocker
>
> This problem was observed after the changes made for SPARK-17931.
> In my use case I'm sending very long insert statements to the Spark thrift
> server, and they fail at TaskDescription.scala:89 because writeUTF throws when
> asked to write a string whose encoded form is longer than 64 KB (see
> https://www.drillio.com/en/2009/java-encoded-string-too-long-64kb-limit/ for
> a description of the issue).
> As suggested by Imran Rashid I tracked down the offending key: it is
> "spark.job.description" and it contains the complete SQL statement.
> The problem can be reproduced by creating a table like:
> create table test (a int) using parquet
> and by sending an insert statement like:
> scala> val r = 1 to 128000
> scala> println("insert into table test values (" + r.mkString("),(") + ")")
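
The writeUTF limit described in the report can be checked in isolation, without Spark. The sketch below assumes nothing beyond java.io; the object name WriteUtfLimit and both helpers are hypothetical. The length-prefixed writer is one possible workaround, shown only to illustrate the idea, and is not claimed to be the fix adopted for this issue.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream, UTFDataFormatException}
import java.nio.charset.StandardCharsets

// Hypothetical helper for illustration only.
object WriteUtfLimit {
  /** True if DataOutputStream.writeUTF accepts s, false if it rejects it as too long. */
  def fitsWriteUTF(s: String): Boolean =
    try {
      new DataOutputStream(new ByteArrayOutputStream()).writeUTF(s)
      true
    } catch {
      // writeUTF stores the encoded length in two bytes, so it cannot exceed 65535.
      case _: UTFDataFormatException => false
    }

  /** Possible workaround: a 4-byte length prefix followed by the raw UTF-8 bytes. */
  def writeLongString(out: DataOutputStream, s: String): Unit = {
    val bytes = s.getBytes(StandardCharsets.UTF_8)
    out.writeInt(bytes.length)
    out.write(bytes)
  }
}
```

For example, WriteUtfLimit.fitsWriteUTF("a" * 65535) succeeds while the same call with 65536 characters does not, which is why a SQL statement as long as the one generated above cannot be written with writeUTF.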