Re: Issue 905: Gracefully ignore UTF-8 BOM in the middle of a file (issue 4908043)

2011-08-24 Thread pkx166h

Passes make and regression tests.

http://codereview.appspot.com/4908043/

___
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel


Re: Issue 905: Gracefully ignore UTF-8 BOM in the middle of a file (issue 4908043)

2011-08-15 Thread lemzwerg

OK.

Please make the warning message more verbose, though.


http://codereview.appspot.com/4908043/



Re: Issue 905: Gracefully ignore UTF-8 BOM in the middle of a file (issue 4908043)

2011-08-15 Thread paconet . org

On 2011/08/15 18:14:21, lemzwerg wrote:

Could you please tell me what this patch is good for?  A BOM not at the
beginning of a file is no longer a BOM...

I am not opposed to emitting a warning if U+FEFF is encountered, and we
subsequently ignore it (since its use as zero width no-break space is
deprecated), but only within strings...

What am I missing?


The BOM is invisible, so you cannot be sure whether it is inside a
string or anywhere else.  In my experience, it is very common for
Windows users to create, open or modify lilypond input files with the
Notepad accessory, perhaps inserting new characters at the very
beginning, thus inadvertently moving the BOM and getting failed
compiles.  Moreover, they are not able to fix the file because the BOM
is invisible.  So the patch would be very useful in that it makes the
BOM transparent to the lilypond syntax, and the user could forget
about it at last.
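
Since the stray BOM cannot be seen in an editor, a small diagnostic can at least report where it sits. The following is only a sketch: find_utf8_boms is a hypothetical helper, not part of LilyPond, and it assumes the file has already been read into a byte string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not LilyPond code): return the byte offset of every
// UTF-8 encoded U+FEFF (bytes EF BB BF) found in `data`.
std::vector<std::size_t> find_utf8_boms (const std::string &data)
{
  static const std::string bom = "\xEF\xBB\xBF";
  std::vector<std::size_t> offsets;
  for (std::size_t pos = data.find (bom); pos != std::string::npos;
       pos = data.find (bom, pos + bom.size ()))
    offsets.push_back (pos);
  return offsets;
}
```

An offset of 0 is a BOM in its usual place; any other offset is the kind of displaced BOM described above, for example offset 1 when a single character was typed in front of it.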

http://codereview.appspot.com/4908043/



Re: Issue 905: Gracefully ignore UTF-8 BOM in the middle of a file (issue 4908043)

2011-08-15 Thread reinhold . kainhofer

Reviewers: lemzwerg,

Message:
On 2011/08/15 18:14:21, lemzwerg wrote:

Could you please tell me what this patch is good for?  A BOM not at the
beginning of a file is no longer a BOM...

I am not opposed to emitting a warning if U+FEFF is encountered, and we
subsequently ignore it (since its use as zero width no-break space is
deprecated), but only within strings...

What am I missing?


RFC 3629 says that U+FEFF is a zero-width no-break space, which is
also used as a BOM. It also says:
"This character can be used as a genuine "ZERO WIDTH NO-BREAK SPACE"
within text,"
...
"It is important to understand that the character U+FEFF appearing at
any position other than the beginning of a stream MUST be interpreted
with the semantics for the zero-width non-breaking space, and MUST
NOT be interpreted as a signature."

Also, our lilypond files are text, so I would understand this to mean
that we should treat a U+FEFF inside the file contents as normal
whitespace.
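
The reading of RFC 3629 quoted above boils down to a decision by position. A minimal sketch of that rule follows; FeffRole and classify_feff are hypothetical names used for illustration only and do not appear in the patch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names, for illustration only.
enum class FeffRole { Signature, ZeroWidthNoBreakSpace, NotFeff };

// Classify the three bytes starting at `offset` in a UTF-8 stream:
// U+FEFF at offset 0 is a signature (BOM); anywhere else it carries the
// semantics of a zero-width no-break space.
FeffRole classify_feff (const std::string &stream, std::size_t offset)
{
  if (offset > stream.size ())
    return FeffRole::NotFeff;
  if (stream.compare (offset, 3, "\xEF\xBB\xBF") != 0)
    return FeffRole::NotFeff;
  return offset == 0 ? FeffRole::Signature
                     : FeffRole::ZeroWidthNoBreakSpace;
}
```

Under this reading a lexer would drop the Signature case silently and handle the ZeroWidthNoBreakSpace case like other whitespace.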


Description:
Issue 905: Gracefully ignore UTF-8 BOM in the middle of a file

Please review this at http://codereview.appspot.com/4908043/

Affected files:
  A input/regression/bom-mark.ly
  M lily/include/lily-lexer.hh
  M lily/lexer.ll
  M lily/lily-lexer.cc


Index: input/regression/bom-mark.ly
diff --git a/input/regression/bom-mark.ly b/input/regression/bom-mark.ly
new file mode 100644
index ..19895a5af8151d00f7656ea5e51df0d214cd5b5d
--- /dev/null
+++ b/input/regression/bom-mark.ly
@@ -0,0 +1,11 @@
+ \version "2.15.9"
+
+#(ly:set-option 'warning-as-error #f)
+
+\header {
+  texidoc = "This input file contains a UTF-8 BOM not at the very beginning,
+  but on the first line after the first byte. LilyPond should gracefully
+  ignore this BOM as specified in RFC 3629, but print a warning."
+}
+
+{ c }
Index: lily/include/lily-lexer.hh
diff --git a/lily/include/lily-lexer.hh b/lily/include/lily-lexer.hh
index 72391a087748cdd676739a8ed2b3646547f077c7..9729ca701664d8cbaa28277408e62c6cc1e434aa 100644
--- a/lily/include/lily-lexer.hh
+++ b/lily/include/lily-lexer.hh
@@ -110,6 +110,7 @@ public:
   void push_note_state (SCM tab);
   void pop_state ();
   void LexerError (char const *);
+  void LexerWarning (char const *);
   void set_identifier (SCM path, SCM val);
   int get_state () const;
   bool is_note_state () const;
Index: lily/lexer.ll
diff --git a/lily/lexer.ll b/lily/lexer.ll
index 7cda144e263c9720868330a988904f7fd45dee89..9cb706ebdcaf2f04f4ef32526779aa636d597da1 100644
--- a/lily/lexer.ll
+++ b/lily/lexer.ll
@@ -189,8 +189,8 @@ BOM_UTF8\357\273\277
 {BOM_UTF8}/.* {
   if (this->lexloc_->line_number () != 1 || this->lexloc_->column_number () != 0)
 {
-  LexerError (_ ("stray UTF-8 BOM encountered").c_str ());
-  exit (1);
+  LexerWarning (_ ("stray UTF-8 BOM encountered").c_str ());
+  // exit (1);
 }
   debug_output (_ ("Skipping UTF-8 BOM"));
 }
Index: lily/lily-lexer.cc
diff --git a/lily/lily-lexer.cc b/lily/lily-lexer.cc
index 5d87c83872d25052496f800de539760a71264c69..ba6429c3ea2798344702178363f200071c0f73cc 100644
--- a/lily/lily-lexer.cc
+++ b/lily/lily-lexer.cc
@@ -310,7 +310,7 @@ void
 Lily_lexer::LexerError (char const *s)
 {
   if (include_stack_.empty ())
-message (_f ("error at EOF: %s", s) + "\n");
+non_fatal_error (s, _f ("%s:EOF", s));
   else
 {
   error_level_ |= 1;
@@ -319,6 +319,18 @@ Lily_lexer::LexerError (char const *s)
 }
 }

+void
+Lily_lexer::LexerWarning (char const *s)
+{
+  if (include_stack_.empty ())
+warning (s, _f ("%s:EOF", s));
+  else
+{
+  Input spot (*lexloc_);
+  spot.warning (s);
+}
+}
+
 char
 Lily_lexer::escaped_char (char c) const
 {




Issue 905: Gracefully ignore UTF-8 BOM in the middle of a file (issue 4908043)

2011-08-15 Thread lemzwerg

Could you please tell me what this patch is good for?  A BOM not at the
beginning of a file is no longer a BOM...

I am not opposed to emitting a warning if U+FEFF is encountered, and we
subsequently ignore it (since its use as zero width no-break space is
deprecated), but only within strings...

What am I missing?

http://codereview.appspot.com/4908043/
