[
https://issues.apache.org/jira/browse/SPARK-25108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
xuejianbest updated SPARK-25108:
--------------------------------
Description:
The Dataset.show() method generates incorrect space padding when a column name
or column value contains Unicode characters.
{code:java}
val df = spark.createDataset(Seq(
  "γύρος",
  "pears",
  "linguiça",
  "xoriço",
  "hamburger",
  "éclair",
  "smørbrød",
  "spätzle",
  "包子",
  "jamón serrano",
  "pêches",
  "シュークリーム",
  "막걸리",
  "寿司",
  "おもち",
  "crème brûlée",
  "fideuà",
  "pâté",
  "お好み焼き")).toDF("value")
df.show
/*
+-------------+
|        value|
+-------------+
|        γύρος|
|        pears|
|     linguiça|
|       xoriço|
|    hamburger|
|       éclair|
|     smørbrød|
|      spätzle|
|           包子|
|jamón serrano|
|       pêches|
|      シュークリーム|
|          막걸리|
|           寿司|
|          おもち|
| crème brûlée|
|       fideuà|
|         pâté|
|        お好み焼き|
+-------------+
*/{code}
Please see the attached pictures for the output before and after the fix.
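
The misalignment presumably arises because the padding is derived from the string length, while terminals usually draw CJK characters two columns wide. Below is a minimal sketch of width-aware padding under that assumption. The names `DisplayWidth`, `displayWidth` and `leftPad` are hypothetical helpers, not Spark API. The code point ranges are a rough approximation of the East Asian wide characters, and combining marks are not handled.

```scala
object DisplayWidth {
  // Approximate code point ranges that terminals usually render two columns wide
  // (an assumption for illustration, not an exhaustive East Asian Width table).
  private val wideRanges: Seq[(Int, Int)] = Seq(
    0x1100 -> 0x115F, // Hangul Jamo
    0x2E80 -> 0xA4CF, // CJK radicals, kana, CJK ideographs, Yi
    0xAC00 -> 0xD7A3, // Hangul syllables
    0xF900 -> 0xFAFF, // CJK compatibility ideographs
    0xFE30 -> 0xFE4F, // CJK compatibility forms
    0xFF00 -> 0xFF60, // full-width forms
    0xFFE0 -> 0xFFE6  // full-width signs
  )

  private def isWide(codePoint: Int): Boolean =
    wideRanges.exists { case (lo, hi) => codePoint >= lo && codePoint <= hi }

  // Number of terminal columns the string is expected to occupy:
  // two per wide code point, one for everything else.
  def displayWidth(s: String): Int =
    s.codePoints().toArray.map(cp => if (isWide(cp)) 2 else 1).sum

  // Right-align `s` in a cell of `width` columns, padding by display width
  // instead of by String.length.
  def leftPad(s: String, width: Int): String =
    " " * math.max(0, width - displayWidth(s)) + s
}
```

For example, the column width can be taken as the largest display width and each cell padded against it:

```scala
val values = Seq("pears", "包子", "막걸리", "hamburger")
val width = values.map(DisplayWidth.displayWidth).max
values.foreach(v => println("|" + DisplayWidth.leftPad(v, width) + "|"))
```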

> Dataset.show() generates incorrect padding for Unicode Character
> ----------------------------------------------------------------
>
> Key: SPARK-25108
> URL: https://issues.apache.org/jira/browse/SPARK-25108
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0, 2.3.1
> Environment: spark-shell on Xshell5
> Reporter: xuejianbest
> Priority: Critical
> Attachments: show.bmp
>
>
> The Dataset.show() method generates incorrect space padding when a column name
> or column value contains Unicode characters.
> {code:java}
> val df = spark.createDataset(Seq(
>   "γύρος",
>   "pears",
>   "linguiça",
>   "xoriço",
>   "hamburger",
>   "éclair",
>   "smørbrød",
>   "spätzle",
>   "包子",
>   "jamón serrano",
>   "pêches",
>   "シュークリーム",
>   "막걸리",
>   "寿司",
>   "おもち",
>   "crème brûlée",
>   "fideuà",
>   "pâté",
>   "お好み焼き")).toDF("value")
> df.show
> /*
> +-------------+
> |        value|
> +-------------+
> |        γύρος|
> |        pears|
> |     linguiça|
> |       xoriço|
> |    hamburger|
> |       éclair|
> |     smørbrød|
> |      spätzle|
> |           包子|
> |jamón serrano|
> |       pêches|
> |      シュークリーム|
> |          막걸리|
> |           寿司|
> |          おもち|
> | crème brûlée|
> |       fideuà|
> |         pâté|
> |        お好み焼き|
> +-------------+
> */{code}
>
> Please see the attached pictures for the output before and after the fix.
> 