Re: [Pharo-project] Invalid utf8 input detected: now what?
On 23.07.2010 15:15, Schwab,Wilhelm K wrote:
> No dialogs, please :) Actually, it would be fine if there were a different stream class or simply a different method/state (encoding = #userInteraction or something??) that is understood to negotiate such details with the user. In general, exception is the correct way to handle this: the stream knows what is wrong; the application will know what to make of it. If the encoding can be detected automatically, that would be great.

There is no way to do this. The only thing that can be determined is that something is not UTF-8. The stream did that reliably. But you said you already know the encoding, so just set it on the stream and it should work.

> Firefox tells me that the encoding is ISO-8859-1;

ISO-8859-1 and ISO-8859-15 are not the same.

Cheers
Philippe
___
Pharo-project mailing list
Pharo-project@lists.gforge.inria.fr
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
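Philippe's two points can be illustrated outside Pharo. The following is a minimal Python sketch, not Pharo code; the sample text and the helper name `is_valid_utf8` are made up for illustration. It shows that strict UTF-8 decoding reliably rejects Latin bytes, while ISO-8859-1 and ISO-8859-15 both accept the same bytes and merely disagree on what some of them mean.

```python
# Hypothetical sample: bytes as a legacy Latin text file might contain them.
raw = "café 10¤".encode("iso-8859-1")  # é becomes 0xE9, ¤ becomes 0xA4


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


utf8_ok = is_valid_utf8(raw)            # the only thing that can be determined
as_latin1 = raw.decode("iso-8859-1")    # byte 0xA4 is the currency sign
as_latin9 = raw.decode("iso-8859-15")   # byte 0xA4 is the euro sign
```

Both single-byte decodings succeed, so nothing in the bytes themselves says which one the author intended.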
Re: [Pharo-project] Invalid utf8 input detected: now what?
I'm OK with calling this "works as intended" if the encoding experts are. Since I am *not* an expert on encoding, I ran it up the flagpole.

From: pharo-project-boun...@lists.gforge.inria.fr [pharo-project-boun...@lists.gforge.inria.fr] On Behalf Of Philippe Marschall [kus...@gmx.net]
Sent: Sunday, July 25, 2010 3:57 AM
To: pharo-project@lists.gforge.inria.fr
Subject: Re: [Pharo-project] Invalid utf8 input detected: now what?

> There is no way to do this. The only thing that can be determined is that something is not utf-8. The stream did that reliably.
> ISO-8859-1 and ISO-8859-15 are not the same.
Re: [Pharo-project] Invalid utf8 input detected: now what?
On Jul 24, 2010, at 1:57 11PM, Schwab,Wilhelm K wrote:
> I agree that there is apparently not much of a problem. However, I also stand by no more dialogs unless they are in a clearly-identified class/method/state that is known to interact with the user. Squeak has *far* too much forced and unexpected interaction, and we must not go back down that road.

Which is why I asked initially whether you encountered this when using a tool (i.e. file browser etc.) or custom code. For tools, where there is no way to set one encoding that will be correct for all cases, it might be better behaviour to open a dialogue where one can be selected instead of raising a DNU, if a GUI is present.

> Those things said, there might be room to grow, as someone suggested the possibility of automatically detecting the coding.

Then they were wrong.

Cheers,
Henry
Re: [Pharo-project] Invalid utf8 input detected: now what?
On Jul 23, 2010, at 4:38 AM, Yanni Chiu wrote:
> Schwab,Wilhelm K wrote:
>> I got an error (on Ubuntu 9.10) trying open an old text file that I created on Windows some time ago. The encoding (if gedit's save-as dialog can be trusted??) is Western ISO-8859-15; resaving as utf8 lets me read it.
>
> You could try viewing the original file in a web browser. Try different encodings until the stuff looks right. Then you might have a better idea of whether you really have a file in ISO-8859-15. You could also view your converted UTF-8 file in a web browser too, and compare the two renderings. If this checks out, then maybe it's a Pharo issue.

Please report it, if possible with a test, so that we can fix it.
Re: [Pharo-project] Invalid utf8 input detected: now what?
On 07/23/2010 04:09 AM, Schwab,Wilhelm K wrote:
> Hello all, I got an error (on Ubuntu 9.10) trying open an old text file that I created on Windows some time ago. The encoding (if gedit's save-as dialog can be trusted??) is Western ISO-8859-15; resaving as utf8 lets me read it. So, is Pharo working by design? Did I do the correct/only thing needed to read the file?

You need to pass the encoding to the file stream.

Cheers
Philippe
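The same principle holds in most stream libraries. As a rough Python analogue (not the Pharo API; the file name and sample text are invented), the sketch below writes a small ISO-8859-15 file, shows that reading it as UTF-8 fails, and then reads it correctly once the encoding is passed to the stream.

```python
import os
import tempfile

text = "Prix: 10 €, façade"
path = os.path.join(tempfile.mkdtemp(), "legacy.txt")  # hypothetical legacy file

# Create the file the way an old Latin-9 editor would have saved it.
with open(path, "w", encoding="iso-8859-15") as f:
    f.write(text)

# Reading with the wrong (UTF-8) decoder raises instead of guessing.
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Passing the known encoding to the stream recovers the text.
with open(path, encoding="iso-8859-15") as f:
    recovered = f.read()
```

The application, not the stream, supplies the encoding, which matches the division of responsibility argued for in this thread.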
Re: [Pharo-project] Invalid utf8 input detected: now what?
No dialogs, please :) Actually, it would be fine if there were a different stream class or simply a different method/state (encoding = #userInteraction or something??) that is understood to negotiate such details with the user. In general, exception is the correct way to handle this: the stream knows what is wrong; the application will know what to make of it. If the encoding can be detected automatically, that would be great.

Firefox tells me that the encoding is ISO-8859-1; I am not leaving off the 5, Firefox and gedit report it differently. In fairness to gedit, I am reporting the encodings listed in its save-as dialog.

Unfortunately the offending file contains specifications that are not mine. I have seen pieces of it published elsewhere (quite recently in fact) but will need to do some checking on the licensing. I might be able to excerpt the file and end up with the same behavior.

Bill

From: pharo-project-boun...@lists.gforge.inria.fr [pharo-project-boun...@lists.gforge.inria.fr] On Behalf Of Henrik Johansen [henrik.s.johan...@veloxit.no]
Sent: Friday, July 23, 2010 7:39 AM
To: Pharo-project@lists.gforge.inria.fr
Subject: Re: [Pharo-project] Invalid utf8 input detected: now what?

On Jul 23, 2010, at 4:09 30AM, Schwab,Wilhelm K wrote:
> Hello all, I got an error (on Ubuntu 9.10) trying to open an old text file that I created on Windows some time ago. The encoding (if gedit's save-as dialog can be trusted??) is Western ISO-8859-15; resaving as utf8 lets me read it. So, is Pharo working by design? Did I do the correct/only thing needed to read the file? What should I be asking? Is there anything I can do to turn this into a useful test/debugging example?
> Bill

This is not an error per se, seeing as the encoding is not utf8 :) If the import was done from some tool instead of in your code (in which case you'd set the encoding of the file stream), a nicer *behavior* might be for the UI Manager to catch encoding errors when trying to read a file, and offer up a dialogue with a list of encodings which the file *can* be read as, along with a preview window of what the text would look like with the selected encoding, like some word processors do.

Cheers,
Henry
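Henry's idea of offering only the encodings a file *can* be read as, each with a preview, can be sketched in a few lines. This is a Python illustration under stated assumptions, not a proposal for the Pharo UI Manager: the candidate list, the function name `readable_encodings` and the sample bytes are all hypothetical.

```python
# Assumed shortlist of encodings a tool might offer; any real list would differ.
CANDIDATES = ["utf-8", "iso-8859-1", "iso-8859-15", "cp1252", "mac_roman"]


def readable_encodings(data: bytes, candidates=CANDIDATES, preview_chars=40):
    """Map each encoding that decodes data without error to a short preview."""
    previews = {}
    for name in candidates:
        try:
            previews[name] = data.decode(name)[:preview_chars]
        except UnicodeDecodeError:
            continue  # this encoding cannot read the data; do not offer it
    return previews


sample = "Größe: 5 €".encode("iso-8859-15")
previews = readable_encodings(sample)
```

Several encodings usually survive the filter, and only a human looking at the previews can tell which rendering is the intended one. That is why this belongs in an interactive tool rather than in the stream itself.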
Re: [Pharo-project] Invalid utf8 input detected: now what?
On 23 July 2010 at 15:15, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote:
> No dialogs, please :) Actually, it would be fine if there were a different stream class or simply a different method/state (encoding = #userInteraction or something??) that is understood to negotiate such details with the user.

I have no idea what you are suggesting...

> In general, exception is the correct way to handle this: the stream knows what is wrong; the application will know what to make of it. If the encoding can be detected automatically, that would be great.

Then I fail to see what the problem is. You got an error stating it was not UTF8, which implies that the application needs to choose the correct encoding (by setting the stream's encoding to something else). As you noticed with gedit/firefox, any automatic detection is at best an educated guess, and cannot be relied upon to make the correct choice.

Cheers,
Henry
Re: [Pharo-project] Invalid utf8 input detected: now what?
Schwab,Wilhelm K wrote:
> I got an error (on Ubuntu 9.10) trying open an old text file that I created on Windows some time ago. The encoding (if gedit's save-as dialog can be trusted??) is Western ISO-8859-15; resaving as utf8 lets me read it.

You could try viewing the original file in a web browser. Try different encodings until the stuff looks right. Then you might have a better idea of whether you really have a file in ISO-8859-15. You could also view your converted UTF-8 file in a web browser, and compare the two renderings. If this checks out, then maybe it's a Pharo issue.
Re: [Pharo-project] invalid utf8 input detected
I did the following: (Object>>#doesNotUnderstand:) getSourceFromFile, and I get an invalid utf8 input detected error. Now when I take another method (BalloonFontTest>>#testDefaultFont) I do not get the problem. I will carefully reread Nicolas's mails to try to understand. I do not know whether the fixes of yoh (http://bugs.squeak.org/view.php?id=5996) are related.

Nicolas,

    {Object>>#doesNotUnderstand:.
     SystemNavigation>>#browseMethodsWhoseNamesContain:.
     Utilities class>>#changeStampPerSe.
     Utilities class>>#methodsWithInitials:} collect: [:e |
        (e getSourceFromFile select: [:s | s charCode > 127]) asArray
            collect: [:c | c charCode]]

I cannot get that code running; it breaks before that for me.

Stef
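The snippet above collects, for each method, the character codes above 127 found in its source. The same check is easy to express on plain strings. Here is a small Python sketch of that idea; the method names and source texts are invented stand-ins, since the real sources live in the image's source files.

```python
def non_ascii_codes(source):
    """Return the code points above 127 found in source, in order."""
    return [ord(ch) for ch in source if ord(ch) > 127]


# Hypothetical method sources, keyed by made-up names.
sources = {
    "plainMethod": "^ self printString",
    "accentedMethod": "\"naïve café comment\" ^ 42",
}

# One list of offending code points per method, empty when the source is pure ASCII.
report = {name: non_ascii_codes(text) for name, text in sources.items()}
```

Methods whose list is non-empty are the only ones that can be affected by a mismatch between the writer's and the reader's encoding.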
Re: [Pharo-project] invalid utf8 input detected
Hi,

I attached here a DNU implementation I took from an older image. After filing this one in, I can debug DNU problems.

Cheers,
Doru

Object-doesNotUnderstand.st
Description: Binary data

--
www.tudorgirba.com

Not knowing how to do something is not an argument for how it cannot be done.
Re: [Pharo-project] invalid utf8 input detected
Actually, the fix is even simpler: if you find a method that raises invalid utf8 input detected, just browse to it with a class browser, and re-accept it :).

With my previous mail, I was not implying that someone should fix it for me, I was merely asking what a quick solution could be, because I was a bit lost (scared) :).

Now, I am happy. Thanks for discussing it.

Cheers,
Doru

--
www.tudorgirba.com

Problem solving efficiency grows with the abstractness level of problem understanding.
Re: [Pharo-project] invalid utf8 input detected
No problem, I never interpreted it like that. I too want a system that works.

Adrian, I will publish a fix for DNU now, and I will try later to check the fixes proposed by Yoshiki.

Stef

On May 23, 2009, at 1:29 PM, Tudor Girba wrote:
> Actually, the fix is even simpler: if you find a method that raises invalid utf8 input detected, just browse to it with a class browser, and re-accept it :). With my previous mail, I was not implying that someone should fix it for me, I was merely asking for what could a quick solution be, because I was a bit lost (scared) :). Now, I am happy. Thanks for discussing it.
Re: [Pharo-project] invalid utf8 input detected
What exactly happened is very hard to trace because these FileStreams are a can of worms... Here are some of my peregrinations:

FIRST POSSIBLE TRACK:
All methods were changed in 10305. The Monticello snapshot/source.st is not UTF-8. If the file is opened as UTF-8, then we get decompiled code, I don't know why yet... But the changes still go into the change log in correct UTF-8 form, so that's just another bug, not the real source of the problem.

To get some worms out of the can, just browse the inst var defs of converter in MultiByteFileStream:
The accessor #converter initializes converter with TextConverter defaultSystemConverter, which depends on LanguageEnvironment. That is a Latin1TextConverter in my latin image. Unless #reset is called first, in which case it will initialize with a UTF8TextConverter. Yes, but open: fileName forWrite: writeMode does the job too, with a UTF8TextConverter. You still follow? Me neither.
A better-behaved one is #setConverterForCode, which should let non-UTF-8 .mcz work in a UTF-8 environment, but I am not sure it is called where required...

I think Yoshiki's changes are necessary only for writing source code with character code > 255. This was not the case for the incriminated methods.

SECOND POSSIBLE TRACK:
Everything going to the change log passes through the MultiByteFileStream, so how did non-UTF-8 characters get in? I tried to follow two other clues:
1) There are senders of #primWrite:from:startingAt:count: not redefined in MultiByteFileStream... For example, using #next:putAll:startingAt: will bypass the converter.
2) Using nextPutAll: with a ByteArray argument also bypasses the converter (see MultiByteFileStream>>#nextPutAll:).
I did not find the senders (do you really believe senders of nextPutAll: can be analyzed?). I tried to instrument the code with Notification, but I'm unable to reproduce the problem, so that was in vain...
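The second track, writes that slip past the converter, is a general hazard of layered streams. The following Python sketch is an analogy, not Pharo code: `io.TextIOWrapper` plays the role of the converting stream and the underlying `BytesIO` plays the role of the primitive write. The sample strings are invented.

```python
import io

raw = io.BytesIO()                                   # stands in for the file itself
stream = io.TextIOWrapper(raw, encoding="utf-8", newline="")

stream.write("é ok\n")       # goes through the UTF-8 encoder: é becomes C3 A9
stream.flush()

# Writing bytes straight to the underlying layer bypasses the encoder,
# so a lone Latin-1 byte 0xE9 lands in an otherwise UTF-8 file.
raw.write("é bad\n".encode("latin-1"))

data = raw.getvalue()

# A later strict UTF-8 read fails at the first byte written behind the encoder's back.
try:
    data.decode("utf-8")
    bad_offset = None
except UnicodeDecodeError as err:
    bad_offset = err.start
```

The failure only shows up on a later read, far from the code path that wrote the bad bytes, which is one reason such bugs are hard to trace by looking at senders.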
THIRD POSSIBLE TRACK:
http://gforge.inria.fr/frs/download.php/22283/Pharo0.1Core-10304cl.zip has the invalid UTF-8 problem, just before the 10305 changes that introduced decompiled code...

So we might attack the problem with another code snippet: (SystemNavigation default browseAllCallsOn: (Smalltalk associationAt: #SourceFiles))...

Hmm, I might have a better clue now. The problem might come from the condenseChanges in update10298. What happens in a condenseChanges? Changes are copied to this file: f := FileStream fileNamed: 'ST80.temp'. So far, so good, because the concreteStream is a MultiByteFileStream. But the method finishes with: SourceFiles at: 2 put: (StandardFileStream oldFileNamed: oldChanges name). Whoa, no MultiByteFileStream here, so no more UTF-8.

But hey, that would be the inverse problem: reading UTF-8 text with a latin1 reader. I can't get an error doing this, only some strange sequence of characters (the UTF-8 encoding)... Unless the incriminated methods are further changed in #script376 or any other method, in which case they are written in latin1 in the change log... Hmm... That could be the case after all. We must restart the update process from http://gforge.inria.fr/frs/download.php/22167/Pharo0.1Core-10296cl-2.zip

One thing is sure: at the next returnFromSnapshot, FileDirectory class>>startup will reopen the changes as UTF-8. So saving the image will reopen UTF-8...

But wait... Maybe we have enough pieces of the puzzle: analyzing Pharo0.1Core-10304cl.changes tells us that Stephane applied several updates before snapshotting the image. So if Kernel and System-Support are changed between 10298 and 10304, then we get the explanation:
- condense changes puts everything in the .changes in UTF-8 but reopens the changes in latin1
- further updates up to 10304 write changes in latin1
- the image snapshot reopens the changes in UTF-8, and thus we get invalid UTF-8...

That's easy to reproduce. Stef, can you confirm?
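The three-step explanation can be replayed on an ordinary file. This is a Python simulation under simplifying assumptions, not the actual Pharo update process: the file name is made up, each log entry is one line, and "latin1" is taken literally as ISO-8859-1.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.changes")  # hypothetical stand-in for the .changes file

# Step 1: condenseChanges writes everything so far as UTF-8.
with open(path, "w", encoding="utf-8", newline="") as log:
    log.write("first method: café\n")

# Step 2: the log is reopened without UTF-8 conversion; later updates land as Latin-1.
with open(path, "a", encoding="latin-1", newline="") as log:
    log.write("second method: déjà\n")

# Step 3: after the snapshot the log is read as UTF-8 again, entry by entry.
with open(path, "rb") as log:
    entries = log.read().splitlines()

statuses = []
for entry in entries:
    try:
        entry.decode("utf-8")
        statuses.append("ok")
    except UnicodeDecodeError:
        statuses.append("invalid utf8")

# The inverse mismatch (UTF-8 bytes read as Latin-1) never raises, it only garbles.
mojibake = "café".encode("utf-8").decode("latin-1")
```

Only the entry written in step 2 fails, which matches the observation that some methods raise the error while others read fine.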
That also explains why I did not get the problem at home: I update early and always save my image afterwards.

After that, we still have to detect and clean up where Monticello sources are interpreted as UTF-8 when they should not be (FIRST TRACK), and eventually make source code go UTF-8 in Monticello, so that non-latin programmers can use their favourite language...

Nicolas
Re: [Pharo-project] invalid utf8 input detected
Hi Nicolas,

I was reading Yoshiki's changes. I will integrate them, but indeed they are not for our case. My reply is below... I tried to follow :)

> But wait... Maybe we get enough pieces of the puzzle: Analyzing the Pharo0.1Core-10304cl.changes tells that Stephane applied several updates before snapshoting the image. So if Kernel and System-Support are changed between 10298 and 10304, then we get the explanation:
> - condense changes put all in the .changes in UTF-8 but reopen the changes in latin1
> - further updates up to 10304 write changes in latin1
> - image snapshot reopen changes in UTF-8 and thus we get further invalid UTF-8...
> That's easy to reproduce. Stef, can you confirm?

How do you want me to confirm? That I redo the image? What we can do is change the update method to block the update at a certain number.
Re: [Pharo-project] invalid utf8 input detected
I confirm the scenario:
1) update10298 runs condenseChanges, which leaves (SourceFiles at: 2) class = StandardFileStream. This is the seed of further problems, because further changes will be encoded in latin1 (or MacRoman, I don't really want to know).
2) update10302 changes the methods with non-ASCII characters.
3) Stef saves the image after update10304, which reopens (SourceFiles at: 2) in UTF-8, but that's too late: the worm is in the apple.

If you save the image just after the condenseChanges, there is no problem, because (SourceFiles at: 2) is opened in Latin1 AFTER all the changes have gotten into it, and reopened in UTF-8 before any further changes get into it.

We must track undue usage of StandardFileStream, such as in #condenseChanges.
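Once a log is known to contain mixed encodings, it helps to find where the bad bytes are. The following Python sketch locates every byte offset at which strict UTF-8 decoding fails. It is an illustration only: the function name and the sample bytes are hypothetical, and it re-slices the buffer after each error, which is simple but not efficient for very large files.

```python
def invalid_utf8_offsets(data: bytes) -> list:
    """Return the absolute offsets of byte sequences that are not valid UTF-8."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder is clean
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice being decoded.
            offsets.append(pos + err.start)
            pos += err.end  # skip the offending bytes and keep scanning
    return offsets


# A UTF-8 line followed by a line written as Latin-1.
sample = "ok é\n".encode("utf-8") + "déjà\n".encode("latin-1")
offsets = invalid_utf8_offsets(sample)
```

Mapping such offsets back to the methods stored at those positions would give the list of sources that need to be re-accepted.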
Re: [Pharo-project] invalid utf8 input detected
On May 23, 2009, at 7:57 PM, Nicolas Cellier wrote:

I confirm the scenario:
1) update10298 runs condenseChanges, which leaves (SourceFiles at: 2) class = StandardFileStream. This is the seed of further problems, because further changes will be encoded in latin1 (or MacRoman, I don't really want to know).
2) update10302 changes the methods with non-ASCII characters.
3) Stef saves the image after update10304, which does reopen (SourceFiles at: 2) in UTF-8, but that's too late, the worm is in the apple.

If you save the image just after the condenseChanges, no problem, because (SourceFiles at: 2) is opened in Latin1 AFTER all the changes have gotten into it, and reopened in UTF-8 before any new changes get into it.

We must track undue usage of StandardFileStream such as #condenseChanges.

Ok, now we cannot really roll back the changes, and I fixed the methods that were leading to invalid UTF. But it means that we should check the StandardFileStream usage. I'm doing some experiments with umejava's code.

Stef

2009/5/23 Nicolas Cellier nicolas.cellier.aka.n...@gmail.com:

What happened exactly is very hard to trace because these FileStreams are a can of worms... Here are some of my peregrinations:

FIRST POSSIBLE TRACK:

All methods were changed in 10305. Monticello snapshot/source.st is not UTF-8. If the file is opened as UTF-8, then we get decompiled code, I don't know why yet... But the changes still go into the change log in correct UTF-8 form, so that's just another bug, not the real source of the problem.

To get some worms out of the can, just browse the inst var defs of converter in MultiByteFileStream: the accessor #converter initializes converter with TextConverter defaultSystemConverter, which depends on LanguageEnvironment. That is a Latin1TextConverter in my latin image. Unless #reset is called first, in which case it will initialize with a UTF8TextConverter. Yes, but open: fileName forWrite: writeMode does the job too, with a UTF8TextConverter. You still follow? Me neither.

A better behaved one is #setConverterForCode, which should let non-UTF-8 .mcz work in a UTF-8 environment, but I'm not sure it is called where required... I think Yoshiki's changes are necessary only for writing source code with character code 255. This was not the case for the incriminated methods.

SECOND POSSIBLE TRACK:

Everything going to the change log passes through the MultiByteFileStream, so how did non-UTF-8 characters get in? I tried to follow two other clues:
1) There are senders of #primWrite:from:startingAt:count: not redefined in MultiByteFileStream... For example, using #next:putAll:startingAt: will bypass the converter.
2) Using nextPutAll: with a ByteArray argument also bypasses the converter (see MultiByteFileStream#nextPutAll:).

I did not find the senders (you really believe senders of nextPutAll: can be analyzed?). I tried to instrument the code with Notification, but I'm unable to reproduce the problem, so that was in vain...

THIRD POSSIBLE TRACK:

http://gforge.inria.fr/frs/download.php/22283/Pharo0.1Core-10304cl.zip has the invalid UTF-8 problem, just before the 10305 changes that introduced decompiled code... So we might attack the problem with another code snippet:

    (SystemNavigation default browseAllCallsOn: (Smalltalk associationAt: #SourceFiles))

... Hmm, I might have a better clue now. The problem might possibly come from the condenseChanges in update10298. What happens in a condenseChanges? Changes are copied to this file:

    f := FileStream fileNamed: 'ST80.temp'.

So far, so good, because the concreteStream is a MultiByteFileStream. But the end finishes with:

    SourceFiles at: 2 put: (StandardFileStream oldFileNamed: oldChanges name)

Waouh, no MultiByteFileStream here, so no more UTF-8. But hey, that would be the inverse problem: reading UTF-8 text with a latin1 reader. I can't get an error doing this, only some strange sequence of characters... (the UTF-8 encoding)... Unless the incriminated methods are further changed in #script376 or any other method... In which case they are written in latin1 in the change log... Hmm... That could be the case eventually.

We must restart the update process from http://gforge.inria.fr/frs/download.php/22167/Pharo0.1Core-10296cl-2.zip

One thing is sure: at the next returnFromSnapshot, FileDirectory class>>startup will reopen the changes in UTF-8. So saving the image will reopen UTF-8... But wait... Maybe we have enough pieces of the puzzle: analyzing Pharo0.1Core-10304cl.changes shows that Stephane applied several updates before snapshotting the image. So if Kernel and System-Support are changed between 10298 and 10304, then we get the explanation:
- condenseChanges puts everything in the .changes in UTF-8 but reopens the changes in latin1
- further updates up to 10304 write changes in latin1
- the image snapshot reopens the changes in UTF-8, and thus we get invalid UTF-8...

That's easy to reproduce.
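The three steps of this explanation can be modelled at the byte level. The sketch below uses Python only to mimic the bytes that end up in the file; it does not use Pharo's stream classes, and the sample method sources and the helper name first_invalid_offset are made up for illustration.

```python
# Byte-level model of the scenario; this is not Pharo code.
# The sample "method sources" are hypothetical.
NBSP = "\u00a0"  # a non-ASCII character (code point 160)

changes = bytearray()

# 1) condenseChanges copies everything into the new file in UTF-8 ...
changes += ("methodA" + NBSP + "body!").encode("utf-8")

# 2) ... but the file is then reopened without a converter, so later
#    writes store one byte per character (Latin-1 style).
changes += ("methodB" + NBSP + "body!").encode("latin-1")


def first_invalid_offset(data):
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        bytes(data).decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        return err.start


# 3) Saving the image reopens the file as UTF-8; reading it back now fails.
offset = first_invalid_offset(changes)
print(offset, changes[offset])  # prints: 21 160
```

The first chunk decodes fine because it was written as UTF-8. The failure only appears at the lone byte 160 written in step 2, which matches the value1: 160 seen in the error report.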
Re: [Pharo-project] invalid utf8 input detected
Wow, great analysis, Nicolas! I was trying to find the cause for several hours now. Your third track exactly matches my findings.

For example, in Object#doesNotUnderstand: prior to the condensing, the source contained a non-ASCII character (UTF-8 encoded as the two bytes 194 160). This gets correctly transferred during the condensing into the new changes file. When you don't save the image (and hence have the standard stream without UTF-8 encoder), what you see in the source is the character Â (this is 194) followed by character 160. That is, we suddenly have two characters, 194 and 160, where before there was just one.

If you load a package, MC will compare methods and think this is a change. When loading the method from the MC file, the source is UTF-8 encoded and decodes to the single Unicode character 160. When storing this source to the file (still without the encoder), it will just directly put 160 there. At this point we have lost the leading byte 194. Next time we start or save the image and have the right encoder again, it will choke because 160 is an invalid first byte in UTF-8.

I think it's safe to fix the invalid methods by overriding their source. So we don't have to backtrack to version 10297.

Thanks,
Adrian
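Why a lone 160 makes the decoder choke can be seen from the bit patterns. The following is a minimal Python sketch of how a UTF-8 decoder classifies a byte by its high bits. The function name utf8_byte_role is an assumption made for this illustration, and nothing here is taken from UTF8TextConverter.

```python
def utf8_byte_role(b):
    """Classify one byte by its high bits, the way a UTF-8 decoder does."""
    if b < 0x80:
        return "ascii"         # 0xxxxxxx
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: never valid as a first byte
    if b < 0xC2:
        return "invalid"       # 0xC0 and 0xC1 could only start overlong forms
    if b < 0xE0:
        return "lead-2"        # 110xxxxx: starts a two-byte sequence
    if b < 0xF0:
        return "lead-3"        # 1110xxxx
    if b < 0xF5:
        return "lead-4"        # 11110xxx
    return "invalid"


# Character 160 is stored in UTF-8 as a lead byte plus a continuation byte.
encoded = "\u00a0".encode("utf-8")
print(list(encoded))                              # prints: [194, 160]
print([utf8_byte_role(b) for b in encoded])       # prints: ['lead-2', 'continuation']

# Read without a converter, the same two bytes show up as two characters.
print([ord(c) for c in encoded.decode("latin-1")])  # prints: [194, 160]

# Written without a converter, character 160 becomes the single byte 160,
# which a UTF-8 reader cannot accept at the start of a character.
print(utf8_byte_role(160))                        # prints: continuation
```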
Re: [Pharo-project] invalid utf8 input detected
Excellent! Thanks guys. I'm preparing a lecture for Torino and I will experiment with umejava's mcz fixes.

Stef
Re: [Pharo-project] invalid utf8 input detected
yes same here.

On May 17, 2009, at 2:10 AM, Tudor Girba wrote:

Hi,

Recently I encountered a strange error:
- I sometimes get a debugger due to some problems in my code
- when I try to investigate the trace, I get another debugger saying 'Invalid utf8 input detected'

This second debugger I can investigate, the previous one not. It looks like something got messed up with the text conversion of the sources.

I am working on 10306 using the 4.1.1b2 VM on Mac. The code I am working on is loaded from squeaksource (Moose, Glamour, Mondrian).

Can anyone confirm this problem?

Cheers,
Doru

ERROR REPORT

'17 May 2009 2:05:50 am

VM: Mac OS - intel - 1056 - Squeak3.8.1 of ''28 Aug 2006'' [latest update: #6747] Squeak VM 4.1.1b2
Image: Pharo0.1 [Latest update: #10306]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Users/girba/Work/Code/squeakingmoose
Trusted Dir /foobar/tooBar/forSqueak/bogus
Untrusted Dir /Users/girba/Library/Preferences/Squeak/Internet/My Squeak

UTF8TextConverter(Object)>>error:
    Receiver: an UTF8TextConverter
    Arguments and temporary variables:
        aString: ''Invalid utf8 input detected''
    Receiver''s instance variables:
        an UTF8TextConverter

UTF8TextConverter>>errorMalformedInput
    Receiver: an UTF8TextConverter
    Arguments and temporary variables:
    Receiver''s instance variables:
        an UTF8TextConverter

UTF8TextConverter>>nextFromStream:
    Receiver: an UTF8TextConverter
    Arguments and temporary variables:
        aStream: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.chan...etc...
        character1: $ 
        value1: 160
        character2: Character tab
        value2: 9
        unicode: nil
        character3: Character tab
        value3: 9
        character4: nil
        value4: nil
    Receiver''s instance variables:
        an UTF8TextConverter

MultiByteFileStream>>next
    Receiver: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.changes''
    Arguments and temporary variables:
        char: nil
        secondChar: nil
        state: nil
    Receiver''s instance variables:

MultiByteFileStream(PositionableStream)>>nextChunk
    Receiver: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.changes''
    Arguments and temporary variables:
        terminator: $!
        out: a WriteStream ''doesNotUnderstand: aMessage Handle the fact that there ...etc...
        ch: Character cr
    Receiver''s instance variables:

MultiByteFileStream(PositionableStream)>>nextChunkText
    Receiver: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.changes''
    Arguments and temporary variables:
        string: nil
        runsRaw: nil
        strm: nil
        runs: nil
        peek: nil
        pos: nil
    Receiver''s instance variables:

[] in RemoteString>>text
    Receiver: a RemoteString
    Arguments and temporary variables:
        theFile: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.chan...etc...
    Receiver''s instance variables:
        sourceFileNumber: 2
        filePositionHi: 10007336

BlockClosure>>ensure:
    Receiver: [closure] in RemoteString>>text
    Arguments and temporary variables:
        aBlock: [closure] in RemoteString>>text
        returnValue: nil
        b: nil
    Receiver''s instance variables:
        outerContext: RemoteString>>text
        startpc: 72
        numArgs: 0

RemoteString>>text
    Receiver: a RemoteString
    Arguments and temporary variables:
        theFile: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.chan...etc...
    Receiver''s instance variables:
        sourceFileNumber: 2
        filePositionHi: 10007336

CompiledMethod>>getSourceFromFile
    Receiver: a CompiledMethod (838)
    Arguments and temporary variables:
        position: 10007336
    Receiver''s instance variables:
        a CompiledMethod (838)

CompiledMethod>>methodNode
    Receiver: a CompiledMethod (838)
    Arguments and temporary variables:
        aClass: Object
        source: nil
    Receiver''s instance variables:
        a CompiledMethod (838)

[] in DebuggerMethodMap class>>forMethod:
    Receiver: DebuggerMethodMap
    Arguments and temporary variables:
        aMethod: a CompiledMethod (838)
    Receiver''s instance variables:
        superclass: Object
Re: [Pharo-project] invalid utf8 input detected
One solution would be to use getSource rather than getSourceFromFile. However, with the following code I detected no problem in my pharo-core copy (10281 updated to 10306):

    | problems total |
    problems := OrderedCollection new.
    total := 0.
    SystemNavigation default allBehaviorsDo: [:cl | total := total + 1].
    'Searching UTF-8 Problems...'
        displayProgressAt: Sensor cursorPoint
        from: 0 to: total
        during: [:bar | | count |
            count := 0.
            SystemNavigation default allBehaviorsDo: [:cl |
                bar value: (count := count + 1).
                cl selectors do: [:sel |
                    [(cl compiledMethodAt: sel) getSourceFromFile] ifError: [
                        var value: 'last problem found ' , cl name , '#' , sel.
                        problems add: cl->sel]]]].
    ^problems
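The idea behind the script (try to read every method's source and collect the ones that fail) can be sketched independently of the image. The Python below is only an analogy under stated assumptions: the dictionary of raw sources and the name find_undecodable are hypothetical stand-ins, not part of Pharo.

```python
def find_undecodable(sources):
    """Return the names whose raw source bytes are not valid UTF-8.

    `sources` maps a method name to its raw bytes (hypothetical sample data).
    """
    problems = []
    for name, raw in sources.items():
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            # Keep going, so that every problem is reported, not just the first.
            problems.append(name)
    return problems


sample = {
    "Object>>foo": b"foo ^ 1",                          # pure ASCII
    "Object>>bar": "bar\u00a0^ 2".encode("utf-8"),      # written through a UTF-8 converter
    "Object>>baz": "baz\u00a0^ 3".encode("latin-1"),    # written with no converter
}
print(find_undecodable(sample))  # prints: ['Object>>baz']
```

Collecting the failures instead of stopping at the first one is the same design choice as the ifError: block in the script.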
Re: [Pharo-project] invalid utf8 input detected
Nicolas, when I run your script on the license looking for image I got using 10306cl, I get the following error:

VM: Mac OS - intel - 1056 - Squeak3.8.1 of '28 Aug 2006' [latest update: #6747] Squeak VM 4.1.1b2
Image: Pharo0.1 [Latest update: #10306]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Data/squeak4.0-relicenseTools/history
Trusted Dir /foobar/tooBar/forSqueak/bogus
Untrusted Dir /Users/ducasse/Library/Preferences/Squeak/Internet/My Squeak

UndefinedObject(Object)>>doesNotUnderstand: #value:
    Receiver: nil
    Arguments and temporary variables:
        error during printing
    Receiver's instance variables:
        nil

[] in [] in [] in [] in UndefinedObject>>DoIt
    Receiver: nil
    Arguments and temporary variables:
        error during printing
    Receiver's instance variables:
        nil

BlockClosure>>valueWithPossibleArgs:
    Receiver: [closure] in [] in [] in [] in UndefinedObject>>DoIt
    Arguments and temporary variables:
        anArray: an Array('Error: Invalid utf8 input detected' an UTF8TextConverter)
    Receiver's instance variables:
        outerContext: [] in [] in [] in UndefinedObject>>DoIt
        startpc: 183
        numArgs: 0

[] in BlockClosure>>ifError:
    Receiver: [closure] in [] in [] in [] in UndefinedObject>>DoIt
    Arguments and temporary variables:
        errorHandlerBlock: Error: Invalid utf8 input detected
        ex: [closure] in [] in [] in [] in UndefinedObject>>DoIt
    Receiver's instance variables:
        outerContext: [] in [] in [] in UndefinedObject>>DoIt
        startpc: 171
        numArgs: 0

BlockClosure>>valueWithPossibleArgs:
    Receiver: [closure] in BlockClosure>>ifError:
    Arguments and temporary variables:
        anArray: an Array(Error: Invalid utf8 input detected)
    Receiver's instance variables:
        outerContext: BlockClosure>>ifError:
        startpc: 40
        numArgs: 1

[] in MethodContext(ContextPart)>>handleSignal:
    Receiver: BlockClosure>>on:do:
    Arguments and temporary variables:
        error during printing
    Receiver's instance variables:
        sender: BlockClosure>>ifError:
        pc: 17
        stackp: 3
        method: a CompiledMethod (2306)
        closureOrNil: nil
        receiver: [closure] in [] in [] in [] in UndefinedObject>>DoIt

BlockClosure>>ensure:
    Receiver: [closure] in MethodContext(ContextPart)>>handleSignal:
    Arguments and temporary variables:
        aBlock: [closure] in MethodContext(ContextPart)>>handleSignal:
        returnValue: nil
        b: nil
    Receiver's instance variables:
        outerContext: MethodContext(ContextPart)>>handleSignal:
        startpc: 90
        numArgs: 0

MethodContext(ContextPart)>>handleSignal:
    Receiver: BlockClosure>>on:do:
    Arguments and temporary variables:
        exception: Error: Invalid utf8 input detected
        val: nil
    Receiver's instance variables:
        sender: BlockClosure>>ifError:
        pc: 17
        stackp: 3
        method: a CompiledMethod (2306)
        closureOrNil: nil
        receiver: [closure] in [] in [] in [] in UndefinedObject>>DoIt

Error(Exception)>>signal
    Receiver: Error: Invalid utf8 input detected
    Arguments and temporary variables:
    Receiver's instance variables:
        messageText: 'Invalid utf8 input detected'
        tag: nil
        signalContext: Error(Exception)>>signal
        handlerContext: BlockClosure>>on:do:
        outerContext: nil

Error(Exception)>>signal:
    Receiver: Error: Invalid utf8 input detected
    Arguments and temporary variables:
        signalerText: 'Invalid utf8 input detected'
    Receiver's instance variables:
        messageText: 'Invalid utf8 input detected'
        tag: nil
        signalContext: Error(Exception)>>signal
        handlerContext: BlockClosure>>on:do:
        outerContext: nil

UTF8TextConverter(Object)>>error:
    Receiver: an UTF8TextConverter
    Arguments and temporary variables:
        aString: 'Invalid utf8 input detected'
    Receiver's instance variables:
        an UTF8TextConverter

UTF8TextConverter>>errorMalformedInput
    Receiver: an UTF8TextConverter
    Arguments and temporary variables:
    Receiver's instance variables:
        an UTF8TextConverter

UTF8TextConverter>>nextFromStream:
    Receiver: an UTF8TextConverter
    Arguments and
Re: [Pharo-project] invalid utf8 input detected
Doru, did you succeed in reproducing that?

Stef
Re: [Pharo-project] invalid utf8 input detected
Sure, a keystroke error: it's bar value:, not var value:. This @!* workspace takes it as a global without a warning.
Re: [Pharo-project] invalid utf8 input detected
There's something weird... If you hit var (UndefinedObject) doesNotUnderstand: #value:, that means there was a problem the first time. Unfortunately, due to a bug in MethodContext tempNames, we don't know the guilty class and selector. From the set of selectors I can see this is Object. From the source file position I cannot say anything, because I do not have the same change log history (sorry, own image). You could try:

    (SourceFiles at: 2) readOnlyCopy position: 10007336; nextChunk

2009/5/17 Stéphane Ducasse stephane.duca...@inria.fr:

sorry for not checking either. When I run this code I indeed do not have a problem on 10306cl

stef
Re: [Pharo-project] invalid utf8 input detected
Hi,

I ran the snippet you sent on both 304cl and 306cl and I get the following list:
Object->#doesNotUnderstand:
SystemNavigation->#browseMethodsWhoseNamesContain:
Utilities class->#changeStampPerSe
Utilities class->#methodsWithInitials:

Indeed, most of the annoyances are due to Object>>doesNotUnderstand:, because when I get a DNU I am stuck (and I feel like in Java :)).

I am not sure I understand whether there is a fix for the problem.

Cheers,
Doru

On 17 May 2009, at 12:06, Nicolas Cellier wrote:
> There's something weird...
> If you hit var (UndefinedObject) doesNotUnderstand: #value:, that means there was a problem the first time.
> Unfortunately, due to a bug in MethodContext tempNames, we don't know which class and selector are guilty.
> From the set of selectors I can see this is Object.
> From the source file position, I cannot say anything because I do not have the same change log history (sorry, own image).
>
> You could try
> (SourceFiles at: 2) readOnlyCopy position: 10007336; nextChunk
Re: [Pharo-project] invalid utf8 input detected
OK,

{Object>>#doesNotUnderstand:.
SystemNavigation>>#browseMethodsWhoseNamesContain:.
Utilities class>>#changeStampPerSe.
Utilities class>>#methodsWithInitials:} collect: [:e | e getSourceFromFile].

does not fail for me, BUT all these sources look like decompileString.
I guess this dates from the condenseChanges that occurred in #update10298.
The change log prior to this update should have the problem.

Nicolas

2009/5/17 Tudor Girba gi...@iam.unibe.ch:
> Hi,
>
> I ran the snippet you sent on both 304cl and 306cl and I get the following list:
> Object->#doesNotUnderstand:
> SystemNavigation->#browseMethodsWhoseNamesContain:
> Utilities class->#changeStampPerSe
> Utilities class->#methodsWithInitials:
>
> Indeed, most of the annoyances are due to Object>>doesNotUnderstand:, because when I get a DNU I am stuck (and I feel like in Java :)).
>
> I am not sure I understand whether there is a fix for the problem.
>
> Cheers,
> Doru
Re: [Pharo-project] invalid utf8 input detected
On May 17, 2009, at 9:42 PM, Nicolas Cellier wrote:
> Just a reminder: my change was not a fix, just a workaround.
> We have to discover why these non UTF-8 sources got into the change file and cure the problem.
> Otherwise we might suffer a plague of decompiled code spreading in our browsers :(

Yes!

Stef
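On Nicolas's point about discovering where the non UTF-8 sources come from: one possible approach is to scan the raw .changes file outside the image and list the byte offsets where strict UTF-8 decoding fails. Those offsets could then be compared with file positions such as the 10007336 seen in this thread, assuming those are byte positions. What follows is only a rough sketch in Python, not code from the image. The function name invalid_utf8_offsets and the sample bytes are made up for illustration; the sample ends in byte 160 followed by two tabs, the values shown in the nextFromStream: frame of Doru's error report.

```python
def invalid_utf8_offsets(data: bytes):
    """Return the byte offsets where strict UTF-8 decoding of data fails.

    Hypothetical helper for illustration; one offset is reported per
    malformed sequence.
    """
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded
            offsets.append(pos + err.start)
            pos += err.end  # resume after the malformed bytes
    return offsets


# Made-up sample: ASCII text, a CR, then byte 160 and two tabs.
sample = b"doesNotUnderstand: aMessage\r\xa0\t\t"
offsets = invalid_utf8_offsets(sample)
print(offsets)

# The same bytes decode without error as ISO-8859-1.
print(repr(sample.decode("latin-1")))

# For a real file (the name is taken from the error report):
# with open("moose.changes", "rb") as f:
#     print(invalid_utf8_offsets(f.read()))
```

Byte 160 is 0xA0, binary 10100000. That is the bit pattern of a UTF-8 continuation byte, so it can never start a sequence, which is why a strict decoder has to reject it. In ISO-8859-1 the same byte is a non-breaking space. Slicing the data on every error is wasteful for a file with many bad bytes, but it keeps the sketch short.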
Re: [Pharo-project] invalid utf8 input detected
I've seen this as well in 10306 on Linux. You can switch to the old debugger to avoid hitting the problem (but you are just avoiding the problem).

The problem is with RemoteString and multiByte file and I thought that that problem had been solved ... perhaps not for all cases?

If this is a problem for a number of folks, I could see if I can work out a workaround in the OTDebugger until the underlying problem is fixed.

Dale

- Tudor Girba gi...@iam.unibe.ch wrote:
| Hi,
|
| Recently I encounter a strange error:
| - I sometimes get a debugger due to some problems in my code
| - when I try to investigate the trace, I get another debugger saying
|   that 'Invalid utf8 input detected'
|
| This second debugger I can investigate, the previous not. It looks
| like something got messed up with the text conversion of the sources.
|
| I am working on 10306 using the 4.1.1b2 VM on Mac. The code I am
| working on is loaded from squeaksource (Moose, Glamour, Mondrian).
|
| Anyone can confirm this problem?
|
| Cheers,
| Doru
|
|
| ERROR REPORT
|
| '17 May 2009 2:05:50 am
|
| VM: Mac OS - intel - 1056 - Squeak3.8.1 of ''28 Aug 2006'' [latest update: #6747] Squeak VM 4.1.1b2
| Image: Pharo0.1 [Latest update: #10306]
|
| SecurityManager state:
| Restricted: false
| FileAccess: true
| SocketAccess: true
| Working Dir /Users/girba/Work/Code/squeakingmoose
| Trusted Dir /foobar/tooBar/forSqueak/bogus
| Untrusted Dir /Users/girba/Library/Preferences/Squeak/Internet/My Squeak
|
| UTF8TextConverter(Object)>>error:
|   Receiver: an UTF8TextConverter
|   Arguments and temporary variables:
|     aString: ''Invalid utf8 input detected''
|   Receiver''s instance variables:
|     an UTF8TextConverter
|
| UTF8TextConverter>>errorMalformedInput
|   Receiver: an UTF8TextConverter
|   Arguments and temporary variables:
|
|   Receiver''s instance variables:
|     an UTF8TextConverter
|
| UTF8TextConverter>>nextFromStream:
|   Receiver: an UTF8TextConverter
|   Arguments and temporary variables:
|     aStream: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.chan...etc...
|     character1: $ 
|     value1: 160
|     character2: Character tab
|     value2: 9
|     unicode: nil
|     character3: Character tab
|     value3: 9
|     character4: nil
|     value4: nil
|   Receiver''s instance variables:
|     an UTF8TextConverter
|
| MultiByteFileStream>>next
|   Receiver: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.changes''
|   Arguments and temporary variables:
|     char: nil
|     secondChar: nil
|     state: nil
|   Receiver''s instance variables:
|
|
| MultiByteFileStream(PositionableStream)>>nextChunk
|   Receiver: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.changes''
|   Arguments and temporary variables:
|     terminator: $!
|     out: a WriteStream ''doesNotUnderstand: aMessage |Handle the fact that there ...etc...
|     ch: Character cr
|   Receiver''s instance variables:
|
|
| MultiByteFileStream(PositionableStream)>>nextChunkText
|   Receiver: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.changes''
|   Arguments and temporary variables:
|     string: nil
|     runsRaw: nil
|     strm: nil
|     runs: nil
|     peek: nil
|     pos: nil
|   Receiver''s instance variables:
|
|
| [] in RemoteString>>text
|   Receiver: a RemoteString
|   Arguments and temporary variables:
|     theFile: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.chan...etc...
|   Receiver''s instance variables:
|     sourceFileNumber: 2
|     filePositionHi: 10007336
|
| BlockClosure>>ensure:
|   Receiver: [closure] in RemoteString>>text
|   Arguments and temporary variables:
|     aBlock: [closure] in RemoteString>>text
|     returnValue: nil
|     b: nil
|   Receiver''s instance variables:
|     outerContext: RemoteString>>text
|     startpc: 72
|     numArgs: 0
|
| RemoteString>>text
|   Receiver: a RemoteString
|   Arguments and temporary variables:
|     theFile: MultiByteFileStream: ''/Users/girba/Work/Code/squeakingmoose/moose.chan...etc...
|   Receiver''s instance variables:
|     sourceFileNumber: 2
|     filePositionHi: 10007336
|
| CompiledMethod>>getSourceFromFile
|   Receiver: a CompiledMethod (838)
|   Arguments and temporary variables:
|     position: