[ https://issues.apache.org/jira/browse/SPARK-33064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Laurent GUEMAPPE updated SPARK-33064:
-------------------------------------
    Description: 
It seems to be a duplicate of *FLEX-18425*, which is a duplicate of SDK-17398, 
which no longer exists. But the bug remains.

(1) I create a text file "café.txt" that contains two lines:
{quote}Café

Café
{quote}
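To take the text editor's encoding out of the picture, the test file in step (1) could be written from code with an explicit UTF-8 encoding. This is only a sketch: the object name `ReproFile` and the helper `writeUtf8Lines` are hypothetical, and it assumes the intended file is UTF-8 with a precomposed "é" (U+00E9), which the report does not state.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

object ReproFile {
  // Hypothetical helper: writes the given lines to `fileName` as UTF-8,
  // each terminated by a newline, and returns the path that was written.
  def writeUtf8Lines(fileName: String, lines: Seq[String]): Path = {
    val content = lines.map(_ + "\n").mkString
    Files.write(Paths.get(fileName), content.getBytes(StandardCharsets.UTF_8))
  }
}
```

For example, `ReproFile.writeUtf8Lines("caf\u00e9.txt", Seq("Caf\u00e9", "Caf\u00e9"))` would produce the two-line file; the `\u00e9` escape stands for "é" so the call does not depend on how the console encodes typed input.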
(2) I type the following command:

*spark.read.csv("café.txt").show()*

It is displayed as follows:

*spark.read.csv("caf.txt").show()*

But it works and returns this:
{quote}+-----+
  |   _c0|
 +-----+
  |  Caf|
  |Café|
 +-----+
{quote}
We notice a shift after "Caf" and "Café".

(3) The following two commands work. The written text files have the same 
content as "café.txt".

*spark.read.csv("café.txt").write.format("text").save("café2")*

*sc.textFile("café.txt").saveAsTextFile("café3")*

 

Once again, spark-shell displays this:

*spark.read.csv("caf.txt").write.format("text").save("caf2")*

*sc.textFile("caf.txt").saveAsTextFile("caf3")*

 

(4) If I type 7 "é" and then 7 Backspaces, using the "é" key of my French 
keyboard, the Scala prompt disappears. I get a new prompt when I press 
Return.

 

The issue (4) as well as the shift in (2) seem to be related to the difference 
between counted characters and displayed characters.
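The gap between counted and displayed units can be illustrated with a small sketch. It does not show what spark-shell or the Windows console actually do internally. It only shows that the same word has different lengths depending on whether characters or encoded bytes are counted. The object name `WidthCheck` is hypothetical, and ISO-8859-1 is used merely as an example of a single-byte encoding.

```scala
import java.nio.charset.StandardCharsets

object WidthCheck {
  // Characters as the JVM counts them (UTF-16 code units).
  def charCount(s: String): Int = s.length

  // Bytes after UTF-8 encoding: a precomposed é (U+00E9) takes two bytes.
  def utf8ByteCount(s: String): Int =
    s.getBytes(StandardCharsets.UTF_8).length

  // Bytes in a single-byte encoding, where é takes one byte.
  def latin1ByteCount(s: String): Int =
    s.getBytes(StandardCharsets.ISO_8859_1).length
}
```

With "Café" written as `"Caf\u00e9"`, `charCount` and `latin1ByteCount` both give 4 while `utf8ByteCount` gives 5. A component that pads or erases by one of these counts while the terminal renders by another would be off by one per accented character, which would be consistent with the shift in (2) and the vanishing prompt in (4).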

 

(5) I do not have this issue when launching Spark from Ubuntu under 
"Windows Subsystem for Linux" version 2.
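Since the behaviour differs between the native Windows console and WSL 2, it may help to compare the JVM's encoding settings in both environments. The following is a minimal sketch with a hypothetical object name, `JvmEncodingReport`. Note that `sun.jnu.encoding` is a JDK-internal property that may be absent on some JVMs, and none of these values reports the console's active code page.

```scala
import java.nio.charset.Charset

object JvmEncodingReport {
  // Reads a system property, substituting a marker when it is not set.
  private def prop(name: String): String =
    Option(System.getProperty(name)).getOrElse("<unset>")

  // Encoding-related settings of the running JVM as name/value pairs.
  def settings(): Seq[(String, String)] = Seq(
    "defaultCharset"   -> Charset.defaultCharset().name(),
    "file.encoding"    -> prop("file.encoding"),
    "sun.jnu.encoding" -> prop("sun.jnu.encoding")
  )
}
```

Running `JvmEncodingReport.settings().foreach(println)` in spark-shell on Windows 10 and again under WSL 2 would show whether the two sessions start with different defaults.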


> Spark-shell does not display accented chara
> -------------------------------------------
>
>                 Key: SPARK-33064
>                 URL: https://issues.apache.org/jira/browse/SPARK-33064
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 3.0.1
>         Environment: Windows 10
> "Beta: Use Unicode UTF-8 for worldwide language support" has been checked.
>            Reporter: Laurent GUEMAPPE
>            Priority: Minor



