On May 23, 2009, at 7:57 PM, Nicolas Cellier wrote:

> I confirm the scenario:
> 1) update10298 runs condenseChanges, which leaves (SourceFiles at: 2) class =
> StandardFileStream.
>   This is the seed of further problems, because further changes will
> be encoded in Latin-1 (or MacRoman, I don't really want to know).
> 2) update10302 changes the methods with non-ASCII characters.
> 3) Stef saves the image after update10304, which does reopen
> (SourceFiles at: 2) in UTF-8, but that's too late, the worm is in the
> apple.
>
> If you save the image just after the condenseChanges, there is no problem,
> because (SourceFiles at: 2) is opened in Latin-1 AFTER all the changes
> have gotten into it, and reopened in UTF-8 before any new changes get into
> it.
> We must track down undue usage of StandardFileStream, such as in
> #condenseChanges.


OK, we cannot really roll back the changes now, and I fixed the methods
that were leading to invalid UTF-8. But it means that we should check
where StandardFileStream is used.
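
A rough sketch of such a check, to be evaluated in a workspace. The first
two expressions only reuse code already quoted in this thread. The scan over
all methods is an assumption about how the affected methods could be listed,
and broken is just an illustrative variable name. It reads the source of
every method, so it may be slow on a full image.

```smalltalk
| broken |

"Which stream class currently backs the .changes file? Per the analysis
 quoted below, StandardFileStream means changes are not written as UTF-8."
Transcript show: (SourceFiles at: 2) class name; cr.

"Browse the methods that reference StandardFileStream directly."
SystemNavigation default
	browseAllCallsOn: (Smalltalk associationAt: #StandardFileStream).

"Assumption: collect the methods whose source can no longer be read from
 the source files, e.g. those raising 'invalid utf8 input detected'."
broken := OrderedCollection new.
SystemNavigation default allBehaviorsDo: [:cls |
	cls selectorsDo: [:sel |
		[(cls compiledMethodAt: sel) getSourceFromFile]
			on: Error
			do: [:ex | broken add: cls name , '>>' , sel]]].
broken inspect.
```
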
I'm doing some experiments with umejava code

Stef
>
>
> 2009/5/23 Nicolas Cellier <nicolas.cellier.aka.n...@gmail.com>:
>> What happened exactly is very hard to trace because these FileStream
>> are a can of worms...
>> Here are some of my peregrinations:
>>
>> FIRST POSSIBLE TRACK:
>>
>> All methods were changed in 10305.
>> Monticello snapshot/source.st is not UTF-8.
>> If the file is opened as UTF-8, then we get decompiledCode, I don't know
>> why yet...
>> But the changes still go into the change log in correct UTF-8 form, so
>> that's just another bug, but not the real source of the problem.
>> To get some worms out of the can, just browse the inst var defs of
>> converter in MultiByteFileStream:
>> The accessor #converter initializes converter with TextConverter
>> defaultSystemConverter, which depends on LanguageEnvironment.
>> That is a Latin1TextConverter in my latin image.
>> Unless #reset is called first, in which case it will initialize with a
>> UTF8TextConverter.
>> Yes, but open: fileName forWrite: writeMode does the job too, with a
>> UTF8TextConverter.
>> You still follow? Me neither.
>> A better-behaved one is #setConverterForCode, which should let non-UTF-8
>> .mcz work in a UTF-8 environment, but I am not sure it is called where
>> required...
>> I think Yoshiki's changes are necessary only for writing source code
>> with character codes > 255.
>> This was not the case for the incriminated methods.
>>
>> SECOND POSSIBLE TRACK:
>>
>> Everything going to the change log passes through the MultiByteFileStream,
>> so how did non-UTF-8 characters get in?
>> I tried to follow two other clues:
>> 1) There are senders of #primWrite:from:startingAt:count: not
>> redefined in MultiByteFileStream...
>> for example, using #next:putAll:startingAt: will bypass the converter.
>> 2) using nextPutAll: with a ByteArray argument also bypasses the
>> converter (see MultiByteFileStream>>#nextPutAll:).
>> I did not find the senders (do you really believe senders of nextPutAll:
>> can be analyzed?).
>> I tried to instrument the code with Notification, but I'm unable to
>> reproduce the problem, so that was in vain...
>>
>> THIRD POSSIBLE TRACK:
>>
>> http://gforge.inria.fr/frs/download.php/22283/Pharo0.1Core-10304cl.zip
>> has the invalid UTF-8 problem, just before 10305 changes that
>> introduced decompiled code...
>> So we might attack the problem with another code snippet:
>>
>> (SystemNavigation default browseAllCallsOn: (Smalltalk associationAt:
>> #SourceFiles))...
>>
>> Hmm, I might have a better clue now.
>> The problem might possibly come from the condenseChanges in update10298.
>> What happens in a condenseChanges?
>> Changes are copied to this file:
>>
>> f := FileStream fileNamed: 'ST80.temp'.
>>
>> So far, so good, because the concreteStream is a MultiByteFileStream.
>>
>> But the method finishes with:
>>
>>       SourceFiles
>>               at: 2
>>               put: (StandardFileStream oldFileNamed: oldChanges name)
>>
>> Wow, no MultiByteFileStream here, so no more UTF-8.
>> But hey, that would be the inverse problem: reading UTF-8 text with a
>> Latin-1 reader. I can't get an error doing this, only some strange
>> sequences of characters (the UTF-8 encoding)...
>> Unless the incriminated methods are changed again in #script376 or any
>> other method... In which case they are written in Latin-1 in the
>> changeLog...
>> Hmm... That could possibly be the case. We must restart the update
>> process from
>> http://gforge.inria.fr/frs/download.php/22167/Pharo0.1Core-10296cl-2.zip
>>
>> One thing is sure, at next returnFromSnapshot, FileDirectory
>> class>>startup will reopen changes UTF-8.
>> So saving the image will reopen UTF-8...
>>
>> But wait... Maybe we have enough pieces of the puzzle:
>> Analyzing Pharo0.1Core-10304cl.changes shows that Stephane applied
>> several updates before snapshotting the image. So if Kernel and
>> System-Support are changed between 10298 and 10304, then we get the
>> explanation:
>> - condenseChanges puts everything in the .changes in UTF-8 but reopens
>> the changes in Latin-1
>> - further updates up to 10304 write changes in Latin-1
>> - the image snapshot reopens the changes in UTF-8, and thus we get
>> invalid UTF-8...
>>
>> That's easy to reproduce. Stef, can you confirm?
>>
>> That also explains why I did not get the problem at home: I update
>> early and always save my image afterwards.
>> After that, we still have to detect and clean up the cases where
>> Monticello sources are interpreted as UTF-8 when they should not be
>> (FIRST TRACK), and eventually make source code go UTF-8 in Monticello,
>> so that non-Latin programmers can use their favourite language...
>>
>> Nicolas
>>
>> 2009/5/23 Stéphane Ducasse <stephane.duca...@inria.fr>:
>>> No problem, I never interpreted it like that.
>>> I too want a system that works.
>>>
>>> Adrian, I will publish a fix for DNU now
>>> and I will try later to check the fixes proposed by Yoshiki.
>>>
>>> stef
>>>
>>> On May 23, 2009, at 1:29 PM, Tudor Girba wrote:
>>>
>>>> Actually, the fix is even simpler: if you find a method that raises
>>>> "invalid utf8 input detected", just browse to it with a class browser
>>>> and re-accept it :).
>>>>
>>>> With my previous mail, I was not implying that someone should fix it
>>>> for me, I was merely asking what a quick solution could be,
>>>> because I was a bit lost (scared) :). Now, I am happy. Thanks for
>>>> discussing it.
>>>>
>>>> Cheers,
>>>> Doru
>>>>
>>>> On 23 May 2009, at 13:07, Tudor Girba wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I attached here a DNU implementation I took from an older image.
>>>>> After filing this one in, I can debug DNU problems.
>>>>>
>>>>> Cheers,
>>>>> Doru
>>>>>
>>>>> <Object-doesNotUnderstand.st>
>>>>>
>>>>>
>>>>>
>>>>> On 23 May 2009, at 13:04, Stéphane Ducasse wrote:
>>>>>
>>>>>> I did the following
>>>>>>
>>>>>> (Object>>#doesNotUnderstand:) getSourceFromFile and I get an
>>>>>> invalid....
>>>>>>
>>>>>> Now when I take another method
>>>>>>
>>>>>> (BalloonFontTest>>#testDefaultFont) I do not get a problem.
>>>>>>
>>>>>> I will carefully reread Nicolas's mails to try to understand.
>>>>>> I do not know if the fixes of yoh
>>>>>>
>>>>>>    http://bugs.squeak.org/view.php?id=5996
>>>>>> are related.
>>>>>>
>>>>>> Nicolas
>>>>>>
>>>>>>>> {Object>>#doesNotUnderstand:.
>>>>>>>> SystemNavigation>>#browseMethodsWhoseNamesContain:.
>>>>>>>> Utilities class>>#changeStampPerSe.
>>>>>>>> Utilities class>>#methodsWithInitials:} collect: [:e | (e
>>>>>>>> getSourceFromFile select: [:s | s charCode > 127]) asArray
>>>>>>>> collect:
>>>>>>>> [:c | c charCode]]
>>>>>>
>>>>>> I cannot get that code running; it breaks before that for me.
>>>>>>
>>>>>> Stef
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> Pharo-project@lists.gforge.inria.fr
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>
>>>>> --
>>>>> www.tudorgirba.com
>>>>>
>>>>> "Not knowing how to do something is not an argument for how it
>>>>> cannot be done."
>>>>>
>>>>
>>>> --
>>>> www.tudorgirba.com
>>>>
>>>> "Problem solving efficiency grows with the abstractness level of
>>>> problem understanding."
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
>

