[jira] [Updated] (SPARK-25108) Dataset.show() generates incorrect padding for Unicode Character

xuejianbest (JIRA) Mon, 13 Aug 2018 19:36:52 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-25108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


xuejianbest updated SPARK-25108:
--------------------------------
    Description: 
The Dataset.show() method generates incorrect space padding when a column name 
or column value contains Unicode characters.
{code:java}
val df = spark.createDataset(Seq(
"γύρος",
"pears",
"linguiça",
"xoriço",
"hamburger",
"éclair",
"smørbrød",
"spätzle",
"包子",
"jamón serrano",
"pêches",
"シュークリーム",
"막걸리",
"寿司",
"おもち",
"crème brûlée",
"fideuà",
"pâté",
"お好み焼き")).toDF("value")

df.show
/*
+-------------+
|        value|
+-------------+
|        γύρος|
|        pears|
|     linguiça|
|       xoriço|
|    hamburger|
|       éclair|
|     smørbrød|
|      spätzle|
|           包子|
|jamón serrano|
|       pêches|
|      シュークリーム|
|          막걸리|
|           寿司|
|          おもち|
| crème brûlée|
|       fideuà|
|         pâté|
|        お好み焼き|
+-------------+

*/{code}
 

Please see the attached pictures for the output before and after the fix.
![show](/user/desktop/doge.png)

  was:
The Dataset.show() method generates incorrect space padding when a column name 
or column value contains Unicode characters.
{code:java}
val df = spark.createDataset(Seq(
"γύρος",
"pears",
"linguiça",
"xoriço",
"hamburger",
"éclair",
"smørbrød",
"spätzle",
"包子",
"jamón serrano",
"pêches",
"シュークリーム",
"막걸리",
"寿司",
"おもち",
"crème brûlée",
"fideuà",
"pâté",
"お好み焼き")).toDF("value")

df.show
/*
+-------------+
|        value|
+-------------+
|        γύρος|
|        pears|
|     linguiça|
|       xoriço|
|    hamburger|
|       éclair|
|     smørbrød|
|      spätzle|
|           包子|
|jamón serrano|
|       pêches|
|      シュークリーム|
|          막걸리|
|           寿司|
|          おもち|
| crème brûlée|
|       fideuà|
|         pâté|
|        お好み焼き|
+-------------+

*/{code}
 

Please see the attached pictures for the output before and after the fix.


> Dataset.show() generates incorrect padding for Unicode Character
> ----------------------------------------------------------------
>
>                 Key: SPARK-25108
>                 URL: https://issues.apache.org/jira/browse/SPARK-25108
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1
>         Environment: spark-shell on Xshell5
>            Reporter: xuejianbest
>            Priority: Critical
>         Attachments: show.bmp
>
>
> The Dataset.show() method generates incorrect space padding when a column name 
> or column value contains Unicode characters.
> {code:java}
> val df = spark.createDataset(Seq(
> "γύρος",
> "pears",
> "linguiça",
> "xoriço",
> "hamburger",
> "éclair",
> "smørbrød",
> "spätzle",
> "包子",
> "jamón serrano",
> "pêches",
> "シュークリーム",
> "막걸리",
> "寿司",
> "おもち",
> "crème brûlée",
> "fideuà",
> "pâté",
> "お好み焼き")).toDF("value")
> df.show
> /*
> +-------------+
> |        value|
> +-------------+
> |        γύρος|
> |        pears|
> |     linguiça|
> |       xoriço|
> |    hamburger|
> |       éclair|
> |     smørbrød|
> |      spätzle|
> |           包子|
> |jamón serrano|
> |       pêches|
> |      シュークリーム|
> |          막걸리|
> |           寿司|
> |          おもち|
> | crème brûlée|
> |       fideuà|
> |         pâté|
> |        お好み焼き|
> +-------------+
> */{code}
>  
> Please see the attached pictures for the output before and after the fix.
> ![show](/user/desktop/doge.png)
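
One way to picture the padding problem described above: show() appears to pad each cell by counting characters, while a terminal typically draws CJK, kana, and Hangul characters two columns wide. The Scala sketch below pads by an approximate display width instead. It is only an illustration under stated assumptions, not the change made for this issue: `WidthAwarePadding`, `isWide`, `displayWidth`, and `leftPad` are hypothetical names, and the code point ranges are a rough subset of the East Asian wide and full-width characters.

```scala
object WidthAwarePadding {
  // Assumption: code points in these ranges occupy two terminal columns.
  // This is a rough approximation, not the full Unicode East Asian Width table.
  def isWide(cp: Int): Boolean =
    (cp >= 0x1100 && cp <= 0x115F) || // Hangul Jamo
    (cp >= 0x2E80 && cp <= 0xA4CF) || // CJK radicals, kana, CJK ideographs, Yi
    (cp >= 0xAC00 && cp <= 0xD7A3) || // Hangul syllables
    (cp >= 0xF900 && cp <= 0xFAFF) || // CJK compatibility ideographs
    (cp >= 0xFE30 && cp <= 0xFE6F) || // CJK compatibility forms
    (cp >= 0xFF00 && cp <= 0xFF60) || // full-width ASCII variants
    (cp >= 0xFFE0 && cp <= 0xFFE6)    // full-width currency signs

  // Approximate number of terminal columns the string occupies.
  def displayWidth(s: String): Int =
    s.codePoints().toArray.map(cp => if (isWide(cp)) 2 else 1).sum

  // Right-align `s` in a cell of `width` columns, measured by display width
  // rather than by character count.
  def leftPad(s: String, width: Int): String =
    " " * math.max(0, width - displayWidth(s)) + s
}
```

With this sketch, `WidthAwarePadding.leftPad("寿司", 13)` prepends 9 spaces rather than the 11 that a plain character count gives, so the cell occupies 13 terminal columns when each wide character is drawn two columns wide.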



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

