Non-ASCII identifiers (was: Re: [fpc-devel] Forwarded message about FPC status)
Martin Schreiber wrote:
> On Monday 24 December 2012 10:23:00 Sven Barth wrote:
>> As I already wrote, there are currently no plans to change the fact that FPC supports only ASCII identifiers.
>
> I don't think we can rely on that. I hoped that FPC would not use cpstrnew, too. So if somebody implements non-ASCII identifiers because he needs a second-source Delphi compiler, it will be merged, because the addition does not break existing code. I assume UTF-8 identifiers would not be very difficult to do in the compiler.

But what will the rule be as to whether something's a valid identifier? Will it have to start with something known to be a letter, or something not known to be a digit or reserved character?

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk
[Opinions above are the author's, not those of his employers or colleagues]
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
Re: Non-ASCII identifiers (was: Re: [fpc-devel] Forwarded message about FPC status)
In our previous episode, Mark Morgan Lloyd said:
>> too. So if somebody implements non-ASCII identifiers because he needs a second-source Delphi compiler, it will be merged, because the addition does not break existing code. I assume UTF-8 identifiers would not be very difficult to do in the compiler.
>
> But what will the rule be as to whether something's a valid identifier? Will it have to start with something known to be a letter, or something not known to be a digit or reserved character?

AFAIK the latter. You specify what is not allowed rather than what is allowed.

But source code edited on multiple platforms might be a problem (e.g. ligatures, denormalization and other forms of slightly different characters). This could make the comparison of identifiers expensive, which is what you don't want in a compiler. But I don't know how big that problem would be. Maybe it is negligible.
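To make the normalization concern concrete, here is a minimal Python sketch (not FPC code). It assumes a compiler would fold each identifier to one canonical form once, when the name is first seen, so that later comparisons stay plain string comparisons. The helper name `canonical_identifier` and the choice of NFKC plus case folding are assumptions made for illustration; nothing in this thread says FPC or Delphi does this.

```python
import unicodedata


def canonical_identifier(name: str) -> str:
    """Hypothetical helper: fold an identifier to a single canonical form.

    NFKC merges composed/decomposed spellings and compatibility characters
    such as ligatures; casefold() mimics Pascal's case-insensitivity.
    """
    return unicodedata.normalize("NFKC", name).casefold()


composed = "caf\u00e9"      # 'e with acute' as one precomposed code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent
ligature = "\ufb01le"       # the 'fi' ligature followed by 'le'

# The raw strings look the same on screen but differ code point by code point.
raw_equal = composed == decomposed

# After folding, both spellings map to the same key.
canonical_equal = canonical_identifier(composed) == canonical_identifier(decomposed)

# The ligature is expanded to the two letters 'f' and 'i'.
ligature_folded = canonical_identifier(ligature)
```

With this approach the cost is paid once per identifier occurrence in the scanner rather than on every symbol table comparison, which is one possible answer to the "expensive comparison" worry.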
Re: Non-ASCII identifiers (was: Re: [fpc-devel] Forwarded message about FPC status)
On 24/12/2012 12:17, Marco van de Voort wrote:
> In our previous episode, Mark Morgan Lloyd said:
>>> too. So if somebody implements non-ASCII identifiers because he needs a second-source Delphi compiler, it will be merged, because the addition does not break existing code. I assume UTF-8 identifiers would not be very difficult to do in the compiler.
>>
>> But what will the rule be as to whether something's a valid identifier? Will it have to start with something known to be a letter, or something not known to be a digit or reserved character?
>
> AFAIK the latter. You specify what is not allowed rather than what is allowed.

Hm, that makes it easy to end up with an incomplete list, which could later become a problem: half-width spaces etc., control chars (RTL/LTR...), currently unused codepoints (that could become anything in future...)
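The gap described above can be illustrated with a short Python sketch. The denylist here is hypothetical and deliberately naive (ASCII whitespace and punctuation only); it is not a rule taken from FPC or Delphi. The three sample code points stand in for the cases named in the message: a narrow space, a directional control character, and a code point that is currently unassigned.

```python
import unicodedata

# Hypothetical, deliberately incomplete denylist: ASCII whitespace and
# punctuation only. A real compiler's list would differ.
NAIVE_DENYLIST = set(" \t\r\n+-*/=<>()[]{}.,;:'\"^@#$&")

SUSPECT = {
    "thin space": "\u2009",
    "right-to-left mark": "\u200f",
    "unassigned code point": "\u0378",
}


def slips_through(ch: str) -> bool:
    """True if the naive denylist would accept this character in an identifier."""
    return ch not in NAIVE_DENYLIST


# For each suspect character: does it slip through, and what is its
# Unicode general category?
report = {
    name: (slips_through(ch), unicodedata.category(ch))
    for name, ch in SUSPECT.items()
}
```

All three characters pass the naive denylist, while their general categories (space separator, format control, unassigned) would be enough to reject them.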
Re: Non-ASCII identifiers (was: Re: [fpc-devel] Forwarded message about FPC status)
In our previous episode, Martin said:
> Hm, that makes it easy to end up with an incomplete list, which could later become a problem: half-width spaces etc., control chars (RTL/LTR...), currently unused codepoints (that could become anything in future...)

Still, that list is shorter than the list of what is allowed. And I'm pretty sure Delphi does it that way; it was said during the Delphi 2009 sales presentation, IIRC (when they demonstrated Unicode in source and object inspector).
Re: Non-ASCII identifiers (was: Re: [fpc-devel] Forwarded message about FPC status)
On 24/12/12 11:22, Martin wrote:
> half-width spaces etc., control chars (RTL/LTR...), currently unused codepoints (that could become anything in future...)

As Marco said, the list will be smaller than the allowed list. Also, the Unicode specification defines blocks and categories for code points, so those could be used too.

E.g., take a look at the TCharacter.IsNumeric(..) implementation. It doesn't do actual code point comparisons; it simply checks the Unicode category of the passed-in code point.

Regards,
- Graeme -
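A category-based check along these lines might look like the following Python sketch. The two category sets are an assumption, loosely modelled on the identifier start/continue classes of Unicode Annex #31; they are not the rule used by FPC, Delphi, or TCharacter. The function name `is_identifier` is likewise hypothetical.

```python
import unicodedata

# Assumed category sets, loosely following UAX #31:
# letters and letter-like numbers may start an identifier ...
START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# ... and marks, decimal digits and connector punctuation may continue it.
CONTINUE = START | {"Mn", "Mc", "Nd", "Pc"}


def is_identifier(name: str) -> bool:
    """Validate an identifier by general category, not by listing code points."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    # Underscore is allowed as a first character, as in Pascal.
    if first != "_" and unicodedata.category(first) not in START:
        return False
    return all(unicodedata.category(ch) in CONTINUE for ch in rest)
```

For example, `is_identifier("größe")` accepts the name, while `is_identifier("a\u200fb")` rejects it because the right-to-left mark has category Cf. Unassigned code points (category Cn) are rejected the same way, so the rule does not silently change meaning for code points assigned in the future.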
Re: Non-ASCII identifiers (was: Re: [fpc-devel] Forwarded message about FPC status)
On 24/12/2012 13:05, Marco van de Voort wrote:
> In our previous episode, Martin said:
>> Hm, that makes it easy to end up with an incomplete list, which could later become a problem: half-width spaces etc., control chars (RTL/LTR...), currently unused codepoints (that could become anything in future...)
>
> Still, that list is shorter than the list of what is allowed. And I'm pretty sure Delphi does it that way; it was said during the Delphi 2009 sales presentation, IIRC (when they demonstrated Unicode in source and object inspector).

If you use a UTF-8 library, you can check the attributes of each codepoint, though that may be slower.
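One common way to limit the cost of per-codepoint attribute lookups is an ASCII fast path: plain ASCII characters are checked against a small fixed set, and only non-ASCII characters go through the Unicode database, with results cached. The Python sketch below illustrates the idea only. The names `is_ident_continue` and `_unicode_ident_continue` and the category set are assumptions, and no claim is made about how much time this would save in a real compiler.

```python
import string
import unicodedata
from functools import lru_cache

# Fast path: the classic ASCII identifier characters.
ASCII_IDENT = frozenset(string.ascii_letters + string.digits + "_")

# Assumed set of categories allowed inside an identifier.
IDENT_CATEGORIES = frozenset(
    {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl", "Mn", "Mc", "Nd", "Pc"}
)


@lru_cache(maxsize=None)
def _unicode_ident_continue(ch: str) -> bool:
    """Slow path: consult the Unicode database, caching each answer."""
    return unicodedata.category(ch) in IDENT_CATEGORIES


def is_ident_continue(ch: str) -> bool:
    """True if ch may appear inside an identifier (not necessarily first)."""
    if ch < "\x80":
        # Pure ASCII source never touches the Unicode tables.
        return ch in ASCII_IDENT
    return _unicode_ident_continue(ch)
```

Source files that stay within ASCII, which is the case for existing code, would only ever take the first branch.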