[
https://issues.apache.org/jira/browse/SPARK-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cui Xixin updated SPARK-23813:
------------------------------
Description:
I am working on replacing Hive SQL with Spark SQL and found that parse_url
behaves differently between the two.
For example:
"select parse_url('http://spark.apache.org/path?query=1 &test=1','HOST') from
dual limit 1"
In Hive the result is "spark.apache.org", but in Spark SQL it is "null".
I then found SPARK-16826,
https://issues.apache.org/jira/browse/SPARK-16826: the implementation was
changed for better performance, but the change also led to this difference. In
fact, the main cause is the space after "query=1". Could we fix it by, for
example, encoding the input string before new URI() and decoding the result?
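
To make the proposal concrete, here is a minimal Java sketch of the behavior described above. It assumes, as the description suggests, that the string is parsed with new URI(); the class name ParseUrlSketch and its methods are hypothetical and are not Spark's actual code. java.net.URI rejects a raw space, so the strict path returns null, while percent-encoding the space first lets the host be extracted.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ParseUrlSketch {
    // Strict parse: java.net.URI throws on illegal characters such as a raw
    // space, and the caller maps that failure to null.
    public static String hostStrict(String url) {
        try {
            return new URI(url).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Hypothetical workaround: percent-encode spaces before parsing.
    public static String hostLenient(String url) {
        return hostStrict(url.replace(" ", "%20"));
    }

    // Same workaround for the query part. URI.getQuery() decodes escaped
    // octets, so the encoded space comes back as a plain space.
    public static String queryLenient(String url) {
        try {
            return new URI(url.replace(" ", "%20")).getQuery();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String url = "http://spark.apache.org/path?query=1 &test=1";
        System.out.println(hostStrict(url));   // null
        System.out.println(hostLenient(url));  // spark.apache.org
        System.out.println(queryLenient(url)); // query=1 &test=1
    }
}
```

One caveat with this sketch: getQuery() decodes every escape sequence, so an input that already contained a literal "%20" would come back with a space instead, and only spaces are handled here, not other illegal characters.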
> [SparkSQL] The result differs between Hive and Spark when using
> PARSE_URL()
> -------------------------------------------------------------------------------
>
> Key: SPARK-23813
> URL: https://issues.apache.org/jira/browse/SPARK-23813
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Cui Xixin
> Priority: Minor