Re: Questions about Unicode-aware C programs under Linux

Ali Majdzadeh Tue, 17 Apr 2007 03:56:44 -0700

Hi Rich
Sorry. I managed to solve the problem. You were right.
One minor problem remains: string literals do not exactly match the
strings read from a file, so the string comparison functions fail. I am
going to investigate it.
Thanks a lot


Best Regards
Ali
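
A note on the comparison problem mentioned above. The message does not say
what causes the mismatch, so the following is only a sketch under one
assumption: that the lines come from fgets(), which keeps the trailing
newline in the buffer, so strcmp() against a literal without "\n" fails.
The helpers chomp() and line_equals() are hypothetical names introduced
for illustration. Other possible causes (a source file saved in an encoding
other than UTF-8, or visually identical characters with different code
points) are not covered here.

```c
#include <string.h>

/* Hypothetical helper: cut the line at the first CR or LF, which removes
 * the "\n" (or "\r\n") that fgets() leaves at the end of the buffer. */
static void chomp(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}

/* Hypothetical helper: returns nonzero if the line, once its line ending
 * is stripped, is byte-for-byte equal to the literal. Modifies line. */
static int line_equals(char *line, const char *literal)
{
    chomp(line);
    return strcmp(line, literal) == 0;
}
```

With these, a buffer filled by fgets() can be checked with
line_equals (buffer, "...") where the literal is written in the same
encoding as the file (UTF-8 here).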

On 4/17/07, Ali Majdzadeh <[EMAIL PROTECTED]> wrote:


Hello Rich
Sorry to bother you again.
I wrote a simple C program following your guidelines, but unfortunately it
does not work correctly.
The program is as follows:

#include        <stdio.h>
#include        <errno.h>
#include        <stdlib.h>
#include        <string.h>
#include        <locale.h>
#include        <langinfo.h>


int     main    (
                        int     argc,
                        char    *argv[]
                )
{
        FILE    *input_file;

        char    buffer[1024];

        if (!setlocale (LC_CTYPE, ""))
        {
                fprintf (stderr, "Locale not specified. Check LC_ALL, "
                                 "LC_CTYPE or LANG.\n");
                return  EXIT_FAILURE;
        }

        if (!(input_file = fopen ("./in.txt", "r")))
        {
                fprintf (stderr, "Could not open file : %s\n",
                         strerror (errno));
                return  EXIT_FAILURE;
        }

        /* fgets returns NULL on EOF or error; print only a line that was read */
        if (fgets (buffer, sizeof (buffer), input_file))
                fprintf (stdout, "%s", buffer);

        fclose (input_file);

        return  EXIT_SUCCESS;
}

The program does not print the line read from the file to stdout correctly
(garbage is printed instead). I also used "cat ./persian.txt | iconv -t utf-8 >
in.txt" to produce a UTF-8 encoded file.

Best Regards
Ali

On 4/17/07, Rich Felker <[EMAIL PROTECTED] > wrote:
>
> On Tue, Apr 17, 2007 at 10:46:44AM +0430, Ali Majdzadeh wrote:
> > Hello Rich
> > Thanks for your response.
> > About your question, I should say "yes", I need some text processing
> > capabilities.
>
> OK.
>
> > Do you mean that I should use common stdio functions? (like, fgets(),
> > ...)
>
> Yes, they'll work fine.
>
> > And what about UTF-8 strings? Do you mean that these strings should be
> > stored in common char*
>
> Yes.
>
> > variables? So, what about the character size difference (Unicode and
> > ASCII)?
> > And also, string functions? (like, strtok())
>
> strtok, strsep, strchr, strrchr, strpbrk, strspn, and strcspn will all
> work just fine on UTF-8 strings as long as the separator characters
> you're looking for are ASCII.
>
> strstr always works on UTF-8, and can be used in place of strchr to
> search for single non-ASCII characters or longer substrings.
>
> Rich
>
> --
> Linux-UTF8:   i18n of Linux on all levels
> Archive:       http://mail.nl.linux.org/linux-utf8/
>
>
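
The points in the quoted reply can be illustrated with a short sketch. It
relies on two properties of UTF-8: every byte of a multibyte sequence has
its high bit set, so an ASCII separator can never occur inside a character,
and a complete encoded character cannot match starting in the middle of
another one, so strstr() finds only real occurrences. The function names
count_fields(), find_utf8() and utf8_length() are hypothetical, chosen for
illustration, and the code assumes its input is valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Count the fields of a UTF-8 string split on ASCII delimiters.
 * strtok() writes NUL bytes into s, so s must be a modifiable buffer. */
static size_t count_fields(char *s, const char *ascii_delims)
{
    size_t n = 0;
    for (char *tok = strtok(s, ascii_delims); tok != NULL;
         tok = strtok(NULL, ascii_delims))
        n++;
    return n;
}

/* Search for a character or substring given as a UTF-8 string.
 * Unlike strchr(), this also works when the character is multibyte. */
static const char *find_utf8(const char *haystack, const char *needle)
{
    return strstr(haystack, needle);
}

/* Count code points rather than bytes: continuation bytes have the
 * bit pattern 10xxxxxx and are skipped. strlen() still gives the
 * byte count, which is what buffer sizes are measured in. */
static size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

This also addresses the question about character size: a char* buffer holds
UTF-8 as a plain byte sequence, strlen() reports bytes, and a count such as
utf8_length() is needed only where the number of characters matters.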
