Re: Best approach to handle accented letters

2016-10-28 Thread Alfred Newman via Digitalmars-d-learn


On Friday, 28 October 2016 at 15:08:59 UTC, Chris wrote:

On Friday, 28 October 2016 at 14:31:47 UTC, Chris wrote:

[...]


What you basically do is you pass the logic on to `map` and 
`map` applies it to each item in the range (cf. [1]):


[...]


Life is beautiful!
Thanks.

Re: Best approach to handle accented letters

2016-10-28 Thread Chris via Digitalmars-d-learn


On Friday, 28 October 2016 at 14:31:47 UTC, Chris wrote:

On Friday, 28 October 2016 at 13:50:24 UTC, Alfred Newman wrote:

It boils down to something like:

if (c in _accent)
  return _accent[c];
else
  return c;

It's just a normal lambda whose body is a ternary expression: 
`(condition) ? yes : no`

I'd recommend using Marc's approach, though.


What you basically do is you pass the logic on to `map` and `map` 
applies it to each item in the range (cf. [1]):


map!(myLogic)(range);

or (more idiomatic)

range.map!(myLogic);

This is true of a lot of functions, or rather templates, in the 
Phobos standard library, especially functions in std.algorithm 
(like find [2], canFind, filter etc.). In this way, instead of 
writing for-loops with if-else statements, you pass the logic to 
be applied within the `!()`-part of the template.
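
To make the comparison with a hand-written loop concrete, here is a minimal sketch. The names `shout`, `shoutLoop` and `shoutMap` are made up for illustration and are not part of Phobos; only `map`, `to` and `toUpper` are library functions.

```d
import std.algorithm : map;
import std.conv : to;
import std.stdio : writeln;
import std.uni : toUpper;

// Hypothetical helper: the "logic" that gets handed to map.
dchar shout(dchar c)
{
    return toUpper(c);
}

// Hand-written version: loop over the characters and build the result.
string shoutLoop(string s)
{
    string result;
    foreach (dchar c; s)
        result ~= shout(c);
    return result;
}

// Same result with map: the logic goes into the `!()` part.
string shoutMap(string s)
{
    return s.map!(shout).to!string;
}

void main()
{
    writeln(shoutLoop("hello")); // prints "HELLO"
    writeln(shoutMap("hello"));  // prints "HELLO"
}
```

Both functions do the same thing; with `map` the loop itself is hidden and only the per-item logic is spelled out.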


// Filter out the letter 'l'
auto result = "Hello, world!".filter!(a => a != 'l'); // lazily yields "Heo, word!"


However, what is returned is not a string. So this won't work:

`writeln("Result is " ~ result);`

// Error: incompatible types for (("Result is ") ~ (result)): 'string' and 'FilterResult!(__lambda2, string)'

It returns a `FilterResult`.

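To fix this, you can either write:
`
import std.conv;
auto result = "Hello, world!".filter!(a => a != 'l').to!string;
`
which converts it into a string,

or you can do this:

`
import std.array;
auto result = "Hello, world!".filter!(a => a != 'l').array;
`

which evaluates the range eagerly into an array. Note that for a 
filtered string this gives you a `dchar[]` rather than a `string`, 
so use the `to!string` version if you want to concatenate with `~`. 
Then you have a string again and

`
writeln("Result is " ~ result);
`
works.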
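
If you want to check what type you actually get back, a small sketch like the following can help. The function name `withoutL` is made up for this example; the `static assert` documents the assumption that `.array` on a filtered string produces a `dchar[]`.

```d
import std.algorithm : filter;
import std.array : array;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: remove every 'l' and return a real string.
string withoutL(string s)
{
    return s.filter!(a => a != 'l').to!string;
}

void main()
{
    // .array evaluates the lazy range; the elements are decoded characters.
    auto asArray = "Hello, world!".filter!(a => a != 'l').array;
    static assert(is(typeof(asArray) == dchar[]));

    string asString = withoutL("Hello, world!");

    // Concatenation with ~ needs a string on both sides.
    writeln("Result is " ~ asString);

    // Passing the pieces as separate arguments avoids the concatenation.
    writeln("Result is ", asArray);
}
```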

Just bear that in mind, because you will get the above error 
sometimes. Marc's example is idiomatic D and you should become 
familiar with it asap.


import std.stdio;
import std.algorithm;
import std.uni;
import std.conv;

void main()
{
    auto str = "très élégant";
    immutable accents = unicode.Diacritic;
    auto removed = str
        // decompose each accented character into base letter + combining mark
        .normalize!NFD
        // drop the combining marks (diacritics), keeping the base letters
        .filter!(c => !accents[c])
        // convert the FilterResult back to a string
        .to!string;
    writeln(removed);  // prints "tres elegant"
}

[1] http://dlang.org/phobos/std_algorithm_iteration.html#.map
[2] http://dlang.org/phobos/std_algorithm_searching.html#.find

Re: Best approach to handle accented letters

2016-10-28 Thread Chris via Digitalmars-d-learn


On Friday, 28 October 2016 at 13:50:24 UTC, Alfred Newman wrote:

On Friday, 28 October 2016 at 11:40:37 UTC, Chris wrote:

[...]


@Chris

As a new guy in the D community, I am not sure, but I think the 
line below is something like a Python lambda, right?


auto removed = to!string(str.map!(a => (a in _accent) ? 
_accent[a] : a));


Can you please rewrite the line in a more didactic way? Sorry, 
but I'm still learning the basics.


Thanks in advance


It boils down to something like:

if (c in _accent)
  return _accent[c];
else
  return c;

It's just a normal lambda whose body is a ternary expression: 
`(condition) ? yes : no`

I'd recommend using Marc's approach, though.
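
Spelled out in full, the one-liner could look something like the sketch below. The names `accentTable`, `replaceAccent` and `removeAccents` are made up for this example, and the table is deliberately shortened, so treat it as an illustration rather than a complete solution.

```d
import std.algorithm : map;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical, shortened lookup table (accented letter -> plain letter).
enum dchar[dchar] accentTable = ['é': 'e', 'è': 'e', 'á': 'a'];

// The lambda body written as an ordinary function with if/else.
dchar replaceAccent(dchar c)
{
    if (c in accentTable)      // is there an entry for this character?
        return accentTable[c]; // yes: return the replacement
    else
        return c;              // no: return the character unchanged
}

// map applies replaceAccent to every character of the string;
// to!string turns the lazy result back into a string.
string removeAccents(string str)
{
    return str.map!(replaceAccent).to!string;
}

void main()
{
    writeln(removeAccents("très élégant")); // prints "tres elegant"
}
```

The lambda `a => (a in _accent) ? _accent[a] : a` is just `replaceAccent` without a name, written inline.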

Re: Best approach to handle accented letters

2016-10-28 Thread Alfred Newman via Digitalmars-d-learn


On Friday, 28 October 2016 at 11:40:37 UTC, Chris wrote:

On Friday, 28 October 2016 at 11:24:28 UTC, Alfred Newman wrote:

Hello,

I'm having some trouble replacing the accented letters in a 
given string with their unaccented counterparts.


Let's say I have the input string "très élégant" and I need to 
write a function that returns just "tres elegant". Considering 
that we need to handle Unicode characters, what is the best way 
to write D code for that?


Cheers


You could try something like this. It works for accents. I 
haven't tested it on other characters yet.


import std.stdio;
import std.algorithm;
import std.array;
import std.conv;

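// Lookup table: accented letter -> unaccented counterpart.
// Only the letters listed here are replaced.
enum
{
  dchar[dchar] _accent = ['á':'a', 'é':'e', 'è':'e', 'í':'i',
    'ó':'o', 'ú':'u', 'Á':'A', 'É':'E', 'Í':'I', 'Ó':'O', 'Ú':'U']
}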

void main()
{
  auto str = "très élégant";
  auto removed = to!string(str.map!(a => (a in _accent) ? 
_accent[a] : a));

  writeln(removed);  // prints "tres elegant"
}


@Chris

As a new guy in the D community, I am not sure, but I think the 
line below is something like a Python lambda, right?


auto removed = to!string(str.map!(a => (a in _accent) ? 
_accent[a] : a));


Can you please rewrite the line in a more didactic way? Sorry, 
but I'm still learning the basics.


Thanks in advance

Re: Best approach to handle accented letters

2016-10-28 Thread Chris via Digitalmars-d-learn


On Friday, 28 October 2016 at 12:52:04 UTC, Marc Schütz wrote:

On Friday, 28 October 2016 at 11:24:28 UTC, Alfred Newman wrote:

[...]


import std.stdio;
import std.algorithm;
import std.uni;
import std.conv;

void main()
{
    auto str = "très élégant";
    immutable accents = unicode.Diacritic;
    auto removed = str
        .normalize!NFD
        .filter!(c => !accents[c])
        .to!string;
    writeln(removed);  // prints "tres elegant"
}

This first decomposes all characters into base and diacritic, 
and then removes the latter.


Cool. That looks pretty neat and it should cover all cases.

Re: Best approach to handle accented letters

2016-10-28 Thread Marc Schütz via Digitalmars-d-learn


On Friday, 28 October 2016 at 11:24:28 UTC, Alfred Newman wrote:

Hello,

I'm having some trouble replacing the accented letters in a 
given string with their unaccented counterparts.


Let's say I have the input string "très élégant" and I need to 
write a function that returns just "tres elegant". Considering 
that we need to handle Unicode characters, what is the best way 
to write D code for that?


Cheers


import std.stdio;
import std.algorithm;
import std.uni;
import std.conv;

void main()
{
    auto str = "très élégant";
    immutable accents = unicode.Diacritic;
    auto removed = str
        .normalize!NFD
        .filter!(c => !accents[c])
        .to!string;
    writeln(removed);  // prints "tres elegant"
}

This first decomposes all characters into base and diacritic, and 
then removes the latter.
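
To see the two steps separately, a small sketch along these lines may help. The function name `stripDiacritics` is made up for this example; `normalize`, `NFD` and `unicode.Diacritic` come from std.uni as in the code above.

```d
import std.algorithm : filter;
import std.conv : to;
import std.range : walkLength;
import std.stdio : writeln;
import std.uni;

// Hypothetical wrapper around the two steps: decompose, then drop the marks.
string stripDiacritics(string s)
{
    auto marks = unicode.Diacritic;
    return s.normalize!NFD.filter!(c => !marks[c]).to!string;
}

void main()
{
    string composed = "\u00E9"; // 'é' as a single precomposed code point
    auto decomposed = composed.normalize!NFD;

    writeln(composed.walkLength);   // 1 code point
    writeln(decomposed.walkLength); // 2 code points: 'e' + U+0301 (combining acute)

    writeln(stripDiacritics("très élégant")); // prints "tres elegant"
}
```

After normalization the accent is a separate code point, which is why a simple `filter` is enough to remove it.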

Re: Best approach to handle accented letters

2016-10-28 Thread Chris via Digitalmars-d-learn


On Friday, 28 October 2016 at 11:24:28 UTC, Alfred Newman wrote:

Hello,

I'm having some trouble replacing the accented letters in a 
given string with their unaccented counterparts.


Let's say I have the input string "très élégant" and I need to 
write a function that returns just "tres elegant". Considering 
that we need to handle Unicode characters, what is the best way 
to write D code for that?


Cheers


You could try something like this. It works for accents. I 
haven't tested it on other characters yet.


import std.stdio;
import std.algorithm;
import std.array;
import std.conv;

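// Lookup table: accented letter -> unaccented counterpart.
// Only the letters listed here are replaced.
enum
{
  dchar[dchar] _accent = ['á':'a', 'é':'e', 'è':'e', 'í':'i',
    'ó':'o', 'ú':'u', 'Á':'A', 'É':'E', 'Í':'I', 'Ó':'O', 'Ú':'U']
}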

void main()
{
  auto str = "très élégant";
  auto removed = to!string(str.map!(a => (a in _accent) ? 
_accent[a] : a));

  writeln(removed);  // prints "tres elegant"
}

Best approach to handle accented letters

2016-10-28 Thread Alfred Newman via Digitalmars-d-learn


Hello,

I'm having some trouble replacing the accented letters in a 
given string with their unaccented counterparts.


Let's say I have the input string "très élégant" and I need to 
write a function that returns just "tres elegant". Considering 
that we need to handle Unicode characters, what is the best way 
to write D code for that?


Cheers
