Sergiu Dumitriu wrote:
> On 04/23/2010 08:50 AM, Denis Gervalle wrote:
>> On Fri, Apr 23, 2010 at 03:32, Sergiu Dumitriu<[email protected]>  wrote:
>>
>>> On 04/06/2010 05:03 PM, Vincent Massol wrote:
>>>> Hi Milind,
>>>>
>>>> On Apr 6, 2010, at 5:00 PM, Milind Kamble wrote:
>>>>
>>>>> Denis,
>>>>>     I understand your point that XE, being used globally, needs to
>>>>> support more than the ASCII char set.
>>>>> While the new reference model matures, could you clarify whether an
>>>>> underscore in a file name would break functionality under the current
>>>>> model, where the attachment name is used as a reference for attachments?
>>>>> If not, would it be possible to stop stripping just the underscore chars
>>>>> and push that fix into the next XE release? I am OK with space chars
>>>>> getting stripped off.
>>>> I don't think that underscores are a problem even with the old "reference
>>>> as string" code. Actually I don't even know why we're stripping them.
>>>> Sergiu might know more. Any idea, Sergiu?
>>>
>>> This is the issue that started it: XWIKI-2087
>>>
>>> So, there were three main problems:
>>>
>>> 1. Impossible to actually restore the attachment from the database since
>>> the ID was generated using the hash of the original, correct name, yet
>>> it was stored using the broken name, with ? instead of non-latin1
>>> characters
>>> 2. Impossible to link to such an attachment, since a non-UTF wiki would
>>> encode non-ASCII chars to their &#xyz; escapes, and the filename wasn't
>>> decoded when trying to get the attachment from the database
>>> 3. Encoding bug in the old WYSIWYG which composed the URL using a wrong
>>> encoding
>>>
>>> 3 should be fixed since we're forcing UTF-8 in URLs.
>>> 2 and 1 should work if the wiki+database are using UTF8, but they might
>>> still fail in latin1.
>>
>> Should we really support non-UTF-8 configurations? We have already lost so
>> much time with these encoding issues, and I really do not understand the
>> advantage of supporting a non-UTF-8 environment.
>>
> 
> Legacy. Maybe if we can provide a nice and quick guide for converting
> a latinX installation into a UTF-8 one, we'd be allowed to require UTF-8.
> We could announce that from 2.5 onwards UTF-8 will be mandatory, if we
> decide to go this way. Maybe the most important latin1 installation is
> xwiki.org itself.
> 
> The most problematic thing is that by default MySQL databases come as
> latin1 (in most distributions, although my Gentoo makes it utf8), and
> this is one of the most frequent sources of encoding problem reports.

Am I correct in saying that MySQL with utf8 is unable to handle some
characters, and so pages containing them can't be saved? My understanding is
that using latin1 is a common workaround, so that MySQL doesn't know which
characters it is handling. Forcing utf8 might lead to some unhappy users who
suddenly find that not only must their database be changed, but some of the
characters used in their language are no longer allowed.
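
To make the concern concrete: as far as I know, MySQL's `utf8` character set
stores at most three bytes per character, which covers only code points up to
U+FFFF. Below is a minimal sketch, assuming that limit, of how one could check
whether a page title or attachment name would be affected. The function names
are made up for illustration and are not part of XWiki or MySQL.

```python
def chars_outside_bmp(text):
    """Return the characters in text whose code point is above U+FFFF.

    Such characters take four bytes in UTF-8, so a column limited to
    three bytes per character (the assumption here) could not store them.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_three_byte_utf8(text):
    """True if every character in text encodes to at most three UTF-8 bytes."""
    return not chars_outside_bmp(text)


if __name__ == "__main__":
    # Accented and CJK characters are below U+FFFF, so they pass the check.
    print(fits_three_byte_utf8("r\u00e9sum\u00e9_\u540d\u524d.txt"))
    # U+1D11E (musical G clef) lies above U+FFFF and fails it.
    print(fits_three_byte_utf8("clef_\U0001D11E.txt"))
```

If that is right, accented Latin and most CJK text would be fine, and only
characters above U+FFFF would be rejected.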

A thought.

Caleb

> 
>>>> Thanks
>>>> -Vincent
>>>>
>>>>> ________________________________
>>>>> From: Denis Gervalle<[email protected]>
>>>>> To: XWiki Developers<[email protected]>
>>>>> Sent: Tue, April 6, 2010 8:30:34 AM
>>>>> Subject: Re: [xwiki-devs] Simple patch to enable/preserve underscore
>>>>> chars in attachment file names
>>>>> On Tue, Apr 6, 2010 at 14:02, Guillaume Lerouge<[email protected]> wrote:
>>>>>> Hi Milind,
>>>>>>
>>>>>> On Tue, Apr 6, 2010 at 1:23 AM, Milind Kamble<[email protected]> wrote:
>>>>>>> Hi. I would like the dev community to evaluate this simple fix that
>>>>>>> will enable uploading of files with underscore chars in the file name
>>>>>>> when users perform the attach action. Our user community is quite
>>>>>>> impressed by the refreshing ease of use and the power and flexibility
>>>>>>> in their collaboration workflow made possible by XE. They would like
>>>>>>> to escape the tyranny of Microsoft-MOSS as early as possible, and the
>>>>>>> main roadblock to doing so is the stripping of spaces and underscores
>>>>>>> from file names which were created in an MS-Office centric environment.
>>>>>>>
>>>>>> I can't do much about your underscore problem (though I promise I'll
>>>>>> poke the developer sitting right next to me so that he looks at it).
>>>>>>
>>>>> I was already aware of this issue, and I have had similar problems with
>>>>> attachments, not only with "_", but also with accented chars etc.
>>>>> Restrictions on attachment names will be easier to change when the new
>>>>> model using references is fully in place, since attachment names are
>>>>> currently used as references for attachments. Be sure I will take care
>>>>> to have it improved.
> 
> 

_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs
