Re: mail-extract-address-components bug

2007-05-22 Thread Richard Stallman
    Thank you for the reply.  But I don't have write access to the
    Emacs CVS, so I ask someone to install it.

Please email the patch and change log to emacs-devel
and ask someone to install it.


___
emacs-pretest-bug mailing list
emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug


Re: mail-extract-address-components bug

2007-05-22 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED]
   Richard Stallman [EMAIL PROTECTED] wrote:

 Thank you for the reply.  But I don't have write access to the
 Emacs CVS, so I ask someone to install it.

 Please email the patch and change log to emacs-devel
 and ask someone to install it.

It has already been committed thanks to Kenichi Handa.

Regards,




Re: mail-extract-address-components bug

2007-05-21 Thread Richard Stallman
Please install your patch in the trunk.  If it proves correct,
we can install it in 22.2.




Re: mail-extract-address-components bug

2007-05-20 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED] Richard Stallman wrote:
 The way I posted, to make the syntax of all non-ASCII characters
 `word', has a weakness.  It is not effective to charsets that
 are created after loading mail-extr.el.

 So, what about the first proposed fix?
 The one that patched this particular function?

Whichever is used, I want someone to verify it.  Although I believe
the first patch[1] does the right thing, is harmless, and can also
be used in the emacs-unicode-2 branch, any change can contain a
mistake, and the patch might be improved further.
I still think it is easier to verify the first patch than the
second one.  If no one turns up, I want the fix (whichever) to
be installed in the trunk and the emacs-unicode-2 branch, not
the EMACS_22_BASE branch.

Regards,

[1] [EMAIL PROTECTED], [EMAIL PROTECTED]




Re: mail-extract-address-components bug

2007-05-18 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED] Kenichi Handa wrote:
 In article [EMAIL PROTECTED], Katsumi Yamaoka [EMAIL PROTECTED] writes:

 The way I posted, to make the syntax of all non-ASCII characters
 `word', has a weakness.  It is not effective to charsets that
 are created after loading mail-extr.el.

 As a syntax table is just a char-table, we can set the
 default value of a syntax table to word.  But, I don't
 think we should do such a thing now.

I think we must study mail-extr.el in its entirety in order to verify
the correctness of the change(s) if we modify the syntax tables.
That will certainly take time, no matter who does it.
OTOH, it seems to be easier to verify the patch I proposed first,
because all we have to do in that case is verify that the code
recognizes non-ASCII characters as `word'.

Of course, I can agree not to do it in Emacs 22.1.  Such
problems will probably seldom happen, except for my colleague
(who often receives mail with funny headers).  And now we
have a workaround which doesn't require modifying mail-extr.el.

 In emacs-unicode-2, all characters (0..0x3FFFFF) exist from
 the start.  Creating a charset just means adding a mapping
 rule between the code-point in the charset and Emacs'
 character.  And, the argument CHAR of modify-syntax-entry
 can be a cons (MIN-CHAR . MAX-CHAR).

 So, we can make a syntax table, modify it for all characters
 to be word, then modify it for each special character.

Thank you for the information.  I will be able to make a
workaround also for Emacs 23 if necessary. ;-)

Regards,




Re: mail-extract-address-components bug

2007-05-18 Thread Richard Stallman
    The way I posted, to make the syntax of all non-ASCII characters
    `word', has a weakness.  It is not effective to charsets that
    are created after loading mail-extr.el.

    Regards,

So, what about the first proposed fix?
The one that patched this particular function?




Re: mail-extract-address-components bug

2007-05-17 Thread Richard Stallman
If people have doubts that changing that syntax table is generally
correct, what about the previous patch that alters just the
particular function?




Re: mail-extract-address-components bug

2007-05-17 Thread Kenichi Handa
In article [EMAIL PROTECTED], Richard Stallman [EMAIL PROTECTED] writes:

 If people have doubts that changing that syntax table is generally
 correct, what about the previous patch that alters just the
 particular function?

At the moment, I don't have time to study the code and his
patch.  So, I can neither doubt nor be sure of anything.

But, in general, I think we can treat all non-ASCII
characters in the same way in a mail address.  And, if
mail-extr.el utilizes a syntax table to parse a mail
address, I think setting all non-ASCII characters to the
same syntax (in the current case, word) is the right thing.

So, my suggestion is to do that and see if it works well if
no one can investigate the code and RFC-822.

---
Kenichi Handa
[EMAIL PROTECTED]




Re: mail-extract-address-components bug

2007-05-17 Thread Stefan Monnier
 think it needs time to be tested widely.  I'm not well informed
 about RFC2822 and what mail-extr.el does (especially the voodoo
 operation, which is disabled by default for Japanese names), too.

RFC2822 only talks about the transmission-format, which is made of bytes,
not chars.  It's probably OK to treat any non-byte char as
a word-constituent.


Stefan "not sure what `non-byte' means on the unicode branch"




Re: mail-extract-address-components bug

2007-05-17 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED] Kenichi Handa wrote:
 In article [EMAIL PROTECTED], Richard Stallman [EMAIL PROTECTED] writes:

 If people have doubts that changing that syntax table is generally
 correct, what about the previous patch that alters just the
 particular function?

 At the moment, I don't have time to study the code and his
 patch.  So, I can neither doubt nor be sure of anything.

 But, in general, I think we can treat all non-ASCII
 characters in the same way in a mail address.  And, if
 mail-extr.el utilizes a syntax table to parse a mail
 address, I think setting all non-ASCII characters to the
 same syntax (in the current case, word) is the right thing.

 So, my suggestion is to do that and see if it works well if
 no one can investigate the code and RFC-822.

The way I posted, to make the syntax of all non-ASCII characters
`word', has a weakness.  It is not effective to charsets that
are created after loading mail-extr.el.

Regards,




Re: mail-extract-address-components bug

2007-05-17 Thread Kenichi Handa
In article [EMAIL PROTECTED], Katsumi Yamaoka [EMAIL PROTECTED] writes:

  So, my suggestion is to do that and see if it works well if
  no one can investigate the code and RFC-822.

 The way I posted, to make the syntax of all non-ASCII characters
 `word', has a weakness.  It is not effective to charsets that
 are created after loading mail-extr.el.

As a syntax table is just a char-table, we can set the
default value of a syntax table to word.  But, I don't
think we should do such a thing now.

In emacs-unicode-2, all characters (0..0x3FFFFF) exist from
the start.  Creating a charset just means adding a mapping
rule between the code-point in the charset and Emacs'
character.  And, the argument CHAR of modify-syntax-entry
can be a cons (MIN-CHAR . MAX-CHAR).

So, we can make a syntax table, modify it for all characters
to be word, then modify it for each special character.
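
The two steps described above might look roughly like the following on a build where `modify-syntax-entry' accepts a (MIN-CHAR . MAX-CHAR) cons (Emacs 23 or later).  This is only a sketch: the names `demo-address-syntax-table' and `demo-char-syntax-in-table' are hypothetical, and treating NO-BREAK SPACE as the one "special" character is an assumption made for illustration, not the change that went into mail-extr.el.

```elisp
;; Hypothetical table for illustration; mail-extr.el defines its own tables.
(defvar demo-address-syntax-table
  (let ((table (make-syntax-table)))
    ;; Step 1: mark every non-ASCII character as a word constituent.
    ;; A cons (MIN . MAX) covers the whole range in a single call.
    (modify-syntax-entry (cons #x80 (max-char)) "w" table)
    ;; Step 2: override the individual characters that need special
    ;; handling.  Treating NO-BREAK SPACE (U+00A0) as whitespace is an
    ;; assumption made for this example.
    (modify-syntax-entry #xa0 " " table)
    table)
  "Sketch of a syntax table in which non-ASCII characters are words.")

(defun demo-char-syntax-in-table (char)
  "Return the syntax class of CHAR according to `demo-address-syntax-table'."
  (with-syntax-table demo-address-syntax-table
    (char-syntax char)))
```

With this table, (demo-char-syntax-in-table #x3000) should report word syntax for the wide space, while ASCII characters keep whatever syntax they inherit from the standard syntax table.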

---
Kenichi Handa
[EMAIL PROTECTED]




Re: mail-extract-address-components bug

2007-05-16 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED] Kenichi Handa wrote:
 In article [EMAIL PROTECTED], Katsumi Yamaoka [EMAIL PROTECTED] writes:

 [...] and the patch[1] I posted first is better.
 But please don't merge it hastily if you have a doubt even if it
 is little.

 I don't know about the code of mail-extr.el nor the detailed
 format of a mail address (RFC-822?).  So, I can't be sure,
 either.  It must be checked by someone who knows both of
 them.

I'm anxious for such a person.  Otherwise...

 Or, should we just install the change and see what happens?

I hope the patch is installed only in the Emacs CVS trunk, since I
think it needs time to be tested widely.  I'm not well informed
about RFC2822 or what mail-extr.el does (especially the voodoo
operation, which is disabled by default for Japanese names), either.

Regards,




Re: mail-extract-address-components bug

2007-05-15 Thread Kenichi Handa
Sorry for the late response on this matter.

In article [EMAIL PROTECTED], Katsumi Yamaoka [EMAIL PROTECTED] writes:

 In [EMAIL PROTECTED] Katsumi Yamaoka wrote:
  I hope this is fixed in Emacs 22.2 (or possibly 22.1).  I've been
  annoyed that `mail-extract-address-components' sometimes fails to
  parse addresses containing non-ASCII names correctly.  Japanese
  people often use their native names in the From header.  Since
  they sometimes use non-ASCII letters which are not specified as
  words in the syntax table, Gnus, for example, fails to build the
  recipient address when replying.

 [...]

  The causes are

 [...]

  and `m-e-a-c' uses `forward-word' to try to skip them even if
  they are not words.

 I found another solution, which doesn't need to modify mail-extr.el:

 ;; Set the syntax of all non-ASCII characters to `word'
 ;; in the syntax tables that mail-extr.el uses.

If it solves the problem, should mail-extr.el be modified to
set up the syntax tables as you did below?

 (eval-after-load "mail-extr"
   '(let ((tables '(mail-extr-address-syntax-table
                    mail-extr-address-comment-syntax-table
                    mail-extr-address-domain-literal-syntax-table
                    mail-extr-address-text-comment-syntax-table
                    mail-extr-address-text-syntax-table))
          table charsets generic-char)
      (while tables
        (setq table (symbol-value (car tables))
              tables (cdr tables)
              charsets charset-list)
        (while charsets
          (setq generic-char (make-char (car charsets))
                charsets (cdr charsets))
          (if (>= generic-char 128)
              (modify-syntax-entry generic-char "w" table))))))

 This form also modifies the syntax of the Latin-1 nbsp character
 to `word' but it doesn't seem to be a problem.

 BTW, I think it should be documented that `modify-syntax-entry'
 allows the generic character of a charset as the first argument,
 as it is mentioned in the doc string of `make-char'.

I'm not sure.  It surely accepts a generic character now,
but the concept of a generic character has been removed in the
emacs-unicode-2 branch (and in the coming Emacs 23).  So even
if we describe that feature now, it would have to be reverted
quite soon.

---
Kenichi Handa
[EMAIL PROTECTED]




Re: mail-extract-address-components bug

2007-05-15 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED] Kenichi Handa wrote:

 Sorry for the late response on this matter.

Please don't worry about it, since the most important thing now is
to release Emacs 22.1.

 In article [EMAIL PROTECTED], Katsumi Yamaoka [EMAIL PROTECTED] writes:

 I hope this is fixed in Emacs 22.2 (or possibly 22.1).  I've been
 annoyed that `mail-extract-address-components' sometimes fails to
 parse addresses containing non-ASCII names correctly.  Japanese
 people often use their native names in the From header.  Since
 they sometimes use non-ASCII letters which are not specified as
 words in the syntax table, Gnus, for example, fails to build the
 recipient address when replying.

 [...]

 The causes are

 [...]

 and `m-e-a-c' uses `forward-word' to try to skip them even if
 they are not words.

 I found another solution, which doesn't need to modify mail-extr.el:

 ;; Set the syntax of all non-ASCII characters to `word'
 ;; in the syntax tables that mail-extr.el uses.

 If it solves the problem, should mail-extr.el be modified to
 set up the syntax tables as you did below?

 (eval-after-load "mail-extr"

I don't want it to be merged into mail-extr.el.  It influences all
the mail-extr processing, and I'm not sure that setting the syntax
of all non-ASCII characters to `word' has no side effects.
I consider it no more than a workaround to be used until
`m-e-a-c' is fixed, and the patch[1] I posted first is better.
But please don't merge it hastily if you have even a slight
doubt.

[1] It has to be fixed slightly.  See [EMAIL PROTECTED].

 BTW, I think it should be documented that `modify-syntax-entry'
 allows the generic character of a charset as the first argument,
 as it is mentioned in the doc string of `make-char'.

 I'm not sure.  It surely accepts a generic character now,
 but the concept of a generic character has been removed in the
 emacs-unicode-2 branch (and in the coming Emacs 23).  So even
 if we describe that feature now, it would have to be reverted
 quite soon.

I see.  I withdraw the request.

Thanks.




Re: mail-extract-address-components bug

2007-05-14 Thread Katsumi Yamaoka
 In [EMAIL PROTECTED] Katsumi Yamaoka wrote:

 *** mail-extr.el~ Sun Jan 21 21:57:52 2007
 --- mail-extr.el  Mon May 14 03:16:51 2007

[...]

 !(not (eq (char-after ? ))

Sorry, this line has to be corrected into:

!  (not (eq (char-after) ? )
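
For readers following along, the replacement loop with this correction applied can be read as a standalone function, roughly as sketched below.  The function name `demo-forward-word-non-ascii' is hypothetical, and the NBSP is written as the escape \u00a0 here instead of the literal character used in the patch; this is an illustration of the idea, not the code as committed.

```elisp
(defun demo-forward-word-non-ascii ()
  "Move forward like `(forward-word 1)', treating non-ASCII as words.
Non-ASCII characters other than NO-BREAK SPACE (U+00A0) are skipped
as if they were word constituents, whatever their syntax class."
  (while (progn
           ;; Skip everything that is neither ASCII nor NBSP.
           (skip-chars-forward "^\000-\177\u00a0")
           (and (not (eobp))
                ;; An ordinary word starts here: let `forward-word' take it.
                (eq ?w (char-syntax (char-after)))
                (progn
                  (forward-word 1)
                  ;; Keep looping only if the word is immediately followed
                  ;; by more non-ASCII text that is not NBSP.
                  (and (not (eobp))
                       (> (char-after) ?\177)
                       (not (eq (char-after) ?\u00a0))))))))
```

Called at the start of a name that mixes kanji with a symbol such as ◆, the function moves over the whole run and stops at the following ASCII space, which is the behaviour plain `forward-word' does not give for non-word characters.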




mail-extract-address-components bug

2007-05-13 Thread Katsumi Yamaoka
Hi,

I hope this is fixed in Emacs 22.2 (or possibly 22.1).  I've been
annoyed that `mail-extract-address-components' sometimes fails to
parse addresses containing non-ASCII names correctly.  Japanese
people often use their native names in the From header.  Since
they sometimes use non-ASCII letters which are not specified as
words in the syntax table, Gnus, for example, fails to build the
recipient address when replying.  Here are examples:

(mail-extract-address-components "阿部 晋三  [EMAIL PROTECTED]")
 => (阿部 晋三  Shinzo Abe 阿部 晋三  [EMAIL PROTECTED] )

(mail-extract-address-components "阿部 晋三 ◆ [EMAIL PROTECTED]")
 => (阿部 晋三 ◆ Shinzo Abe 阿部 晋三 ◆ [EMAIL PROTECTED] )

(mail-extract-address-components
 "阿部 晋三  (しんちゃん) [EMAIL PROTECTED]")
 => (阿部 晋三  (しんちゃん [EMAIL PROTECTED])

The causes are `　' (wide space) and `◆' in the name portion:
`m-e-a-c' uses `forward-word' to try to skip them even though
they are not words.  I tried to fix this problem as follows:

(NOTE: it contains Latin-1 nbsp characters encoded with utf-8.)
*** mail-extr.el~	Sun Jan 21 21:57:52 2007
--- mail-extr.el	Mon May 14 03:16:51 2007
***************
*** 873,879 ****
                    (mail-extr-nuke-char-at (point))
                    (forward-char 1))
                   (t
!                   (forward-word 1)))
             (or (eq char ?\()
                 ;; At the end of first address of a multiple address header.
                 (and (eq char ?,)
--- 873,889 ----
                    (mail-extr-nuke-char-at (point))
                    (forward-char 1))
                   (t
!                   ;; Do `(forward-word 1)', recognizing non-ASCII characters
!                   ;; except Latin-1 nbsp as words.
!                   (while (progn
!                            (skip-chars-forward "^\000-\177 ")
!                            (and (not (eobp))
!                                 (eq ?w (char-syntax (char-after)))
!                                 (progn
!                                   (forward-word 1)
!                                   (and (not (eobp))
!                                        (> (char-after) ?\177)
!                                        (not (eq (char-after ? ))
             (or (eq char ?\()
                 ;; At the end of first address of a multiple address header.
                 (and (eq char ?,)

I would appreciate someone looking into it.

Regards,