Re: String Literals
On Tuesday, 3 May 2022 at 17:21:47 UTC, JG wrote: Hi, The specification of string literals has either some errors or I don't understand what is meant by a Character. [...] Which to me means that e.g. r""" should be a WysiwygString, which the compiler thinks is not (not surprisingly). Am I misunderstanding something? The rule is not correct but the implementation in the lexer is. That's a valid issue for dlang.org
String Literals
Hi,

The specification of string literals has either some errors or I don't understand what is meant by a Character. For instance we have:

    WysiwygString:
        r" WysiwygCharacters_opt " StringPostfix_opt

    WysiwygCharacters:
        WysiwygCharacter
        WysiwygCharacter WysiwygCharacters

    WysiwygCharacter:
        Character
        EndOfLine

    Character:
        any Unicode character

Which to me means that e.g. r""" should be a WysiwygString, which the compiler thinks is not (not surprisingly). Am I misunderstanding something?
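To make the point concrete, here is a minimal sketch of how the lexer behaves in practice, as opposed to what the quoted grammar rule literally allows. The variable names are made up for illustration. A double quote always terminates an r"..." literal, so a backtick-delimited WYSIWYG string is one way to hold a quote character.

```d
// r"..." takes every character literally, so backslashes need no escaping.
enum path = r"C:\temp\new";
static assert(path.length == 11);

// A double quote always ends an r"..." literal, which is why r""" does not lex.
// A backtick-delimited WYSIWYG string can contain the double quote instead.
enum quote = `"`;
static assert(quote == "\"");
static assert(quote.length == 1);

void main() {}
```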
Re: == comparison of string literals, and their usage
On 07/04/2019 at 14:23, bauss via Digitalmars-d-learn wrote: On Saturday, 6 April 2019 at 19:47:14 UTC, lithium iodate wrote: On Saturday, 6 April 2019 at 15:35:22 UTC, diniz wrote: So, I still could store and use and compare string pointers myself [1], and get valid results, meaning: pointer equality implies (literal) string equality. Or am I wrong? The point is, the parser, operating on an array of prescanned lexemes, will constantly check whether a valid lexeme is present simply by checking the lexeme "class". I don't want that to be a real string comp, too expensive and for no gain. [1] As in the second comp of your example: void main() { auto c2 = "one" == "two"; auto c1 = "one".ptr is "two".ptr; } Not quite. D-strings strictly consist of pointer *and* length, so you need to compare the .length properties as well to correctly conclude that the strings are equal. You can concisely do that in one go by simply `is` comparing the array references as in string a = "hello"; string b = a; assert(a is b); assert(a[] is b[]); Of course, if the strings are never sliced, you can just compare the pointers and be done, just make sure to document how it operates. Depending on the circumstances I'd throw in some asserts that do actual string comparison to verify the program logic. To add onto this. Here is an example why it's important to compare the length as well: string a = "hello"; string b = a[0 .. 3]; assert(a.ptr == b.ptr); assert(a.length != b.length); Thank you! Very clear :-). -- diniz {la vita e estranj}
Re: == comparison of string literals, and their usage
On Saturday, 6 April 2019 at 19:47:14 UTC, lithium iodate wrote: On Saturday, 6 April 2019 at 15:35:22 UTC, diniz wrote: So, I still could store and use and compare string pointers myself [1], and get valid results, meaning: pointer equality implies (literal) string equality. Or am I wrong? The point is, the parser, operating on an array of prescanned lexemes, will constantly check whether a valid lexeme is present simply by checking the lexeme "class". I don't want that to be a real string comp, too expensive and for no gain. [1] As in the second comp of your example: void main() { auto c2 = "one" == "two"; auto c1 = "one".ptr is "two".ptr; } Not quite. D-strings strictly consist of pointer *and* length, so you need to compare the .length properties as well to correctly conclude that the strings are equal. You can concisely do that in one go by simply `is` comparing the array references as in string a = "hello"; string b = a; assert(a is b); assert(a[] is b[]); Of course, if the strings are never sliced, you can just compare the pointers and be done, just make sure to document how it operates. Depending on the circumstances I'd throw in some asserts that do actual string comparison to verify the program logic. To add onto this. Here is an example why it's important to compare the length as well: string a = "hello"; string b = a[0 .. 3]; assert(a.ptr == b.ptr); assert(a.length != b.length);
Re: == comparison of string literals, and their usage
On 06/04/2019 at 21:47, lithium iodate via Digitalmars-d-learn wrote: On Saturday, 6 April 2019 at 15:35:22 UTC, diniz wrote: So, I still could store and use and compare string pointers myself [1], and get valid results, meaning: pointer equality implies (literal) string equality. Or am I wrong? The point is, the parser, operating on an array of prescanned lexemes, will constantly check whether a valid lexeme is present simply by checking the lexeme "class". I don't want that to be a real string comp, too expensive and for no gain. [1] As in the second comp of your example: void main() { auto c2 = "one" == "two"; auto c1 = "one".ptr is "two".ptr; } Not quite. D-strings strictly consist of pointer *and* length, so you need to compare the .length properties as well to correctly conclude that the strings are equal. You can concisely do that in one go by simply `is` comparing the array references as in string a = "hello"; string b = a; assert(a is b); assert(a[] is b[]); Of course, if the strings are never sliced, you can just compare the pointers and be done, just make sure to document how it operates. Depending on the circumstances I'd throw in some asserts that do actual string comparison to verify the program logic. Thank you very much! And yes, properly documenting is also important to me. -- diniz {la vita e estranj}
Re: == comparison of string literals, and their usage
On Saturday, 6 April 2019 at 15:35:22 UTC, diniz wrote: So, I still could store and use and compare string pointers myself [1], and get valid results, meaning: pointer equality implies (literal) string equality. Or am I wrong? The point is, the parser, operating on an array of prescanned lexemes, will constantly check whether a valid lexeme is present simply by checking the lexeme "class". I don't want that to be a real string comp, too expensive and for no gain. [1] As in the second comp of your example: void main() { auto c2 = "one" == "two"; auto c1 = "one".ptr is "two".ptr; } Not quite. D-strings strictly consist of pointer *and* length, so you need to compare the .length properties as well to correctly conclude that the strings are equal. You can concisely do that in one go by simply `is` comparing the array references as in string a = "hello"; string b = a; assert(a is b); assert(a[] is b[]); Of course, if the strings are never sliced, you can just compare the pointers and be done, just make sure to document how it operates. Depending on the circumstances I'd throw in some asserts that do actual string comparison to verify the program logic.
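The pointer-plus-length point can be sketched as follows. The helper name `sameSlice` is made up for illustration; the only thing it relies on is that `is` applied to two arrays compares both `.ptr` and `.length`.

```d
// Hypothetical helper: identity comparison of two slices.
// `is` on arrays is true only when both .ptr and .length match.
bool sameSlice(string a, string b)
{
    return a is b;
}

void main()
{
    string a = "hello";
    string b = a;          // same pointer, same length
    string c = a[0 .. 3];  // same pointer, shorter length

    assert(sameSlice(a, b));
    assert(c.ptr is a.ptr);    // the pointers alone cannot tell a and c apart
    assert(!sameSlice(a, c));  // the length makes the difference
    assert(c == "hel");
}
```

Comparing only `.ptr` is therefore safe just as long as no sliced prefix of a lexeme string can ever reach the comparison.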
Re: == comparison of string literals, and their usage
On 06/04/2019 at 16:07, AltFunction1 via Digitalmars-d-learn wrote: On Friday, 5 April 2019 at 14:49:50 UTC, diniz wrote: Hello, Since literal strings are interned (and immutable), can I count on the fact that they are compared (==) by pointer? No. "==" performs a full array comparison and "is" is apparently simplified at compile time. In the compiler there's no notion of string literal as a special expression. It's always a StringExp. See https://d.godbolt.org/z/K5R6u6. However you're right to say that literals are not duplicated. Thank you very much. So, I still could store and use and compare string pointers myself [1], and get valid results, meaning: pointer equality implies (literal) string equality. Or am I wrong? The point is, the parser, operating on an array of prescanned lexemes, will constantly check whether a valid lexeme is present simply by checking the lexeme "class". I don't want that to be a real string comp, too expensive and for no gain. [1] As in the second comp of your example: void main() { auto c2 = "one" == "two"; auto c1 = "one".ptr is "two".ptr; } -- diniz {la vita e estranj}
Re: == comparison of string literals, and their usage
On Friday, 5 April 2019 at 14:49:50 UTC, diniz wrote: Hello, Since literal strings are interned (and immutable), can I count on the fact that they are compared (==) by pointer? No. "==" performs a full array comparison and "is" is apparently simplified at compile time. In the compiler there's no notion of string literal as a special expression. It's always a StringExp. See https://d.godbolt.org/z/K5R6u6. However you're right to say that literals are not duplicated.
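A small sketch of the difference between `==` and `is`, assuming a string that is assembled at run time (the function `build` is hypothetical and exists only to force a separate allocation):

```d
// Hypothetical helper: builds "hello" at run time, so the result lives in
// freshly allocated memory rather than in the literal's storage.
string build()
{
    return "hel".idup ~ "lo";
}

void main()
{
    string a = "hello";
    string b = build();

    assert(a == b);   // == compares contents element by element
    assert(a !is b);  // is compares identity (pointer and length)
}
```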
== comparison of string literals, and their usage
Hello, Since literal strings are interned (and immutable), can I count on the fact that they are compared (==) by pointer? Context: The use case is a custom lexer for a custom language. I initially wanted to represent lexeme classes by a big enum 'LexClass'. However, this makes me write 3 times all constant lexemes (keywords and keysigns): 1- in the enum of lexeme classes 2- in an array of constants (for the constant-scanning func) 3- in an associative array mapping constants to their classes However, if literal strings are compared by equality, then they are kinds of Scheme or Ruby symbols: read enum values representing *cases*, which is exactly what I need. I would thus use the constants' strings themselves as lexeme classes... the parser would not be slowed down. What do you think? -- diniz {la vita e estranj}
Re: Problems with string literals and etc.c.odbc.sql functions
On Saturday, 19 December 2015 at 14:16:36 UTC, anonymous wrote: On 19.12.2015 14:20, Marc Schütz wrote: As this is going to be passed to a C function, it would need to be zero-terminated. `.dup` doesn't do this, he'd have to use `std.string.toStringz` instead. However, that function returns a `immutable(char)*`, which would have to be cast again :-( Ouch, I totally missed that. Looks like we don't have a nice way to do this then? I guess so. Theoretically, we could change `toStringz()` to return `char*`; if its result is unique (we would have to change the implementation to always make a copy), it should then be implicitly convertible to `immutable(char)*`. But that would break code, because it could have been used like `auto s = "xyz".toStringz;`, where `s` would then have a different type. So, there'd need to be an additional function, `toMutableStringz`.
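For illustration only, the additional function proposed above could look roughly like this. `toMutableStringz` is the hypothetical name from the post, not something Phobos is known to provide, and this sketch always copies.

```d
// Hypothetical helper (not a Phobos function): returns a freshly allocated,
// mutable, zero-terminated copy of s.
char* toMutableStringz(const(char)[] s)
{
    auto buf = new char[s.length + 1];
    buf[0 .. s.length] = s[];   // copy the characters
    buf[s.length] = '\0';       // append the terminator C expects
    return buf.ptr;
}

void main()
{
    char* p = toMutableStringz("xyz");
    p[0] = 'X';                       // legal: the copy is mutable
    assert(p[0 .. 4] == "Xyz\0");
}
```

The buffer is GC-allocated, so the D side has to keep a reference for as long as the C side might still use the pointer.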
Re: Problems with string literals and etc.c.odbc.sql functions
On Saturday, 19 December 2015 at 17:30:02 UTC, Kagamin wrote: On Saturday, 19 December 2015 at 13:20:03 UTC, Marc Schütz wrote: As this is going to be passed to a C function No, ODBC API is designed with multilingual capability in mind, it doesn't rely on null terminated strings heavily: all string arguments support length specification. Nice, then SQL_NTS means "null terminated string" and should be replaced by `mystring.length`...
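A sketch of the pointer-plus-length style, using a stand-in function instead of a real ODBC call so that it runs without a driver. `countSemicolons` and the connection string are hypothetical; only the parameter shape (a `char*` followed by a `short` length) mirrors the signature quoted in the thread.

```d
import std.conv : to;

// Hypothetical stand-in shaped like an ODBC text argument pair
// (mutable pointer plus explicit length); not part of any real binding.
size_t countSemicolons(char* text, short textLength)
{
    size_t n;
    foreach (c; text[0 .. textLength])  // no terminator needed, length is given
        if (c == ';')
            ++n;
    return n;
}

void main()
{
    // .dup yields a mutable char[]; no cast and no zero terminator required.
    char[] conn = "DSN=example;UID=me;".dup;
    assert(countSemicolons(conn.ptr, conn.length.to!short) == 2);
}
```

`to!short` throws if the length does not fit, which seems preferable to a silent truncating cast.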
Re: Problems with string literals and etc.c.odbc.sql functions
Well, ISO 9075-3 doesn't use const qualifiers, but uses IN/OUT qualifiers instead, e.g. ExecDirect function is declared as: ExecDirect ( StatementHandle IN INTEGER, StatementText IN CHARACTER(L), TextLength IN INTEGER ) RETURNS SMALLINT And in C header: SQLRETURN SQLExecDirect(SQLHSTMT StatementHandle, SQLCHAR *StatementText, SQLINTEGER TextLength);
Re: Problems with string literals and etc.c.odbc.sql functions
On Saturday, 19 December 2015 at 13:20:03 UTC, Marc Schütz wrote: As this is going to be passed to a C function No, ODBC API is designed with multilingual capability in mind, it doesn't rely on null terminated strings heavily: all string arguments support length specification.
Re: Problems with string literals and etc.c.odbc.sql functions
On Friday, 18 December 2015 at 22:35:04 UTC, anonymous wrote: If the parameter is really not const, i.e. the function may mutate the argument, then the cast is not ok. You can use `.dup.ptr` instead to get a proper char* from a string. As this is going to be passed to a C function, it would need to be zero-terminated. `.dup` doesn't do this, he'd have to use `std.string.toStringz` instead. However, that function returns a `immutable(char)*`, which would have to be cast again :-(
Re: Problems with string literals and etc.c.odbc.sql functions
On 19.12.2015 14:20, Marc Schütz wrote: As this is going to be passed to a C function, it would need to be zero-terminated. `.dup` doesn't do this, he'd have to use `std.string.toStringz` instead. However, that function returns a `immutable(char)*`, which would have to be cast again :-( Ouch, I totally missed that. Looks like we don't have a nice way to do this then?
Re: Problems with string literals and etc.c.odbc.sql functions
On Friday, 18 December 2015 at 22:18:34 UTC, Adam D. Ruppe wrote: That's what the examples on MSDN do too though, a cast. At first I thought the binding was missing a const, but the ODBC docs don't specify it as const either and cast. The ODBC functions also have a size parameter for string parameters. You can set it to SQL_NTS to use the 0 terminated C standard. Might justify on why it's char* instead of const char*. It looks like it's alright, then. Just one implementation detail I didn't notice before.
Re: Problems with string literals and etc.c.odbc.sql functions
On Friday, 18 December 2015 at 22:35:04 UTC, anonymous wrote: If the parameter is de facto const, then the cast is ok. Though, maybe it should be marked const then. I'm just worried about casts because I read somewhere that strings start with the number of characters inside them (probably in slices documentation), and not with actual content (though string literals probably act different in this case). Documentation on casts say: Casting a pointer type to and from a class type is done as a type paint (i.e. a reinterpret cast). Reinterpretation is rather dangerous if strings are stored differently. But this test gives me a good hope on this case: writeln(*(cast(char*) "Test")); Casting is what I'm going with. .dup.ptr is less clear in this case.
Problems with string literals and etc.c.odbc.sql functions
By the use of this tutorial (http://www.easysoft.com/developer/languages/c/odbc_tutorial.html), I thought it would be very straightforward to use etc.c.odbc.sqlext and etc.c.odbc.sql to create a simple odbc application. But as soon as I started, I noticed a quirk: SQLRETURN ret; SQLHDBC dbc; ret = SQLDriverConnect(dbc, null, "DNS=*mydns*;", SQL_NTS, null, 0, null, SQL_DRIVER_COMPLETE); This gives me an error: function etc.c.odbc.sqlext.SQLDriverConnect (void* hdbc, void* hwnd, char* szConnStrIn, short cbConnStrIn, char* szConnStrOut, short cbConnStrOutMax, short* pcbConnStrOut, ushort fDriverCompletion) is not callable using argument types (void*, typeof(null), string, int, typeof(null), int, typeof(null), int) After some casting, I found out it's all related to the string literal. I thought it would work straight off the box, after reading the "Interfacing to C" spec (http://dlang.org/spec/interfaceToC.html). When I remove the string literal and replace it with null, it compiles. .ptr and .toStringz both give immutable char* references, and don't work. A "cast(char *)"DNS=*maydns*;"" works, but it feels a lot like a hack that will not work in the long run.
Re: Problems with string literals and etc.c.odbc.sql functions
On 18.12.2015 23:14, Fer22f wrote: By the use of this tutorial (http://www.easysoft.com/developer/languages/c/odbc_tutorial.html), I thought it would be very straightforward to use etc.c.odbc.sqlext and etc.c.odbc.sql to create a simple odbc application. But as soon as I started, I noticed a quirk: SQLRETURN ret; SQLHDBC dbc; ret = SQLDriverConnect(dbc, null, "DNS=*mydns*;", SQL_NTS, null, 0, null, SQL_DRIVER_COMPLETE); This gives me an error: function etc.c.odbc.sqlext.SQLDriverConnect (void* hdbc, void* hwnd, char* szConnStrIn, short cbConnStrIn, char* szConnStrOut, short cbConnStrOutMax, short* pcbConnStrOut, ushort fDriverCompletion) is not callable using argument types (void*, typeof(null), string, int, typeof(null), int, typeof(null), int) After some casting, I found out it's all related to the string literal. I thought it would work straight off the box, after reading the "Interfacing to C" spec (http://dlang.org/spec/interfaceToC.html). When I remove the string literal and replace it with null, it compiles. .ptr and .toStringz both give immutable char* references, and don't work. A "cast(char *)"DNS=*maydns*;"" works, but it feels a lot like a hack that will not work in the long run. If the parameter is de facto const, then the cast is ok. Though, maybe it should be marked const then. If the parameter is really not const, i.e. the function may mutate the argument, then the cast is not ok. You can use `.dup.ptr` instead to get a proper char* from a string. Also, remember that D's GC doesn't scan foreign memory. So if the function keeps the string around, and that's the only reference, then the GC would collect it. The function probably doesn't keep the string around, though.
Re: Problems with string literals and etc.c.odbc.sql functions
On Friday, 18 December 2015 at 22:14:04 UTC, Fer22f wrote: When I remove the string literal and replace it with null, it compiles. .ptr and .toStringz both give immutable char* references, and don't work. A "cast(char *)"DNS=*maydns*;"" works, but it feels a lot like a hack that will not work in the long run. That's what the examples on MSDN do too though, a cast. At first I thought the binding was missing a const, but the ODBC docs don't specify it as const either and cast. So it is kinda weird but I think right according to docs. However, I'd argue we should make it const if it can be...
Re: Problems with string literals and etc.c.odbc.sql functions
On 19.12.2015 01:06, Fer22f wrote: Documentation on casts say: Casting a pointer type to and from a class type is done as a type paint (i.e. a reinterpret cast). That sentence doesn't apply. string is not a class, it's an alias for immutable(char)[], i.e. it's an array. Reinterpretation is rather dangerous if strings are stored differently. But this test gives me a good hope on this case: writeln(*(cast(char*) "Test")); Casting is what I'm going with. .dup.ptr is less clear in this case. Correctness beats clarity. I'd like to advise you not to use casts unless you know for sure that they're safe. Here, you need to know what a string is exactly, what the cast does exactly, and what exactly the called function does with the pointer.
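What a string is can be checked directly. The following sketch only uses core language features; the variable names are arbitrary.

```d
// string is an alias for a slice of immutable characters, not a class.
static assert(is(string == immutable(char)[]));

void main()
{
    string s = "Test";
    assert(s.length == 4);    // the length is stored in the slice itself
    assert(*s.ptr == 'T');    // .ptr points straight at the first character

    // A mutable copy avoids casting immutability away.
    char[] m = s.dup;
    m[0] = 'B';
    assert(m == "Best");
    assert(s == "Test");      // the original is untouched
}
```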
Re: Passing string literals to C
On Wednesday, 31 December 2014 at 11:19:36 UTC, Laeeth Isharc wrote: Argh - no way to edit. What's best practice here? D strings are not null-terminated. === cpling.c char* cpling(char *s) { s[0]='!'; return s; } === dcaller.d extern(C) char* cpling(char* s); void callC() { writefln("%s",fromStringz(cpling("hello\0"))); } or void callC() { writefln("%s",fromStringz(cpling(toStringz("hello")))); } === am I missing a better way to do this? To call a C function you can either use string literals which are always null terminated or use std.string.toStringz (when using string variables) to add the null and return a char*. To convert from char* (from a C function return value) to a D string use std.conv.to!(string).
Re: Passing string literals to C
Argh - no way to edit. What's best practice here? D strings are not null-terminated. === cpling.c char* cpling(char *s) { s[0]='!'; return s; } === dcaller.d extern(C) char* cpling(char* s); void callC() { writefln("%s",fromStringz(cpling("hello\0"))); } or void callC() { writefln("%s",fromStringz(cpling(toStringz("hello")))); } === am I missing a better way to do this?
Passing string literals to C
What's best practice here? D strings are not null-terminated. char* cpling(char *s) { So toString(This i
Re: Passing string literals to C
On Wed, 31 Dec 2014 11:19:35 +, Laeeth Isharc via Digitalmars-d-learn digitalmars-d-learn@puremagic.com wrote: Argh - no way to edit. What's best practice here? D strings are not null-terminated. === cpling.c char* cpling(char *s) { s[0]='!'; return s; } === dcaller.d extern(C) char* cpling(char* s); void callC() { writefln("%s",fromStringz(cpling("hello\0"))); } or void callC() { writefln("%s",fromStringz(cpling(toStringz("hello")))); } === am I missing a better way to do this? First, I am not sure, but you do not need to call fromStringz in this case. Next, in this example you do not even need to call toStringz, because D string literals are null-terminated. But generally it is better to use toStringz when you need to pass D strings to C code. Important Note: When passing a char* to a C function, and the C function keeps it around for any reason, make sure that you keep a reference to it in your D code. Otherwise, it may go away during a garbage collection cycle and cause a nasty bug when the C code tries to use it.
Re: Passing string literals to C
On 12/31/2014 8:19 PM, Laeeth Isharc wrote: Argh - no way to edit. What's best practice here? D strings are not null-terminated. === cpling.c char* cpling(char *s) { s[0]='!'; return s; } === dcaller.d extern(C) char* cpling(char* s); void callC() { writefln("%s",fromStringz(cpling("hello\0"))); } or void callC() { writefln("%s",fromStringz(cpling(toStringz("hello")))); } === am I missing a better way to do this? String literals are always null-terminated. You can typically pass them as-is and D will do the right thing (you can also pass MyStr.ptr if you want). Use toStringz when the string came from an external source (read from a file, passed into a function and so on), since you can't be sure if it was a literal or not. toStringz will recognize if it has a null-terminator and will not do anything if it does. Also, you should make sure to consider std.conv.to on any C strings returned into D if you are going to keep them around. fromStringz only creates a slice, which is fine for how you use it here, but could get you into trouble if you aren't careful. std.conv.to will allocate a new string.
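The advice above can be tried out with a C function that every platform ships, `strlen` from `core.stdc.string`. This is a minimal sketch, not the original poster's `cpling` example; the variable names are arbitrary.

```d
import core.stdc.string : strlen;
import std.conv : to;
import std.string : fromStringz, toStringz;

void main()
{
    // A literal converts implicitly to const(char)* and is zero-terminated.
    assert(strlen("hello") == 5);

    // A string assembled at run time goes through toStringz first.
    string runtime = "wor".idup ~ "ld";
    assert(strlen(runtime.toStringz) == 5);

    // Going the other way: fromStringz slices the C memory without copying,
    // while to!string allocates an independent D string.
    const(char)* c = "from C";
    auto slice = fromStringz(c);
    assert(slice == "from C");
    assert(slice.ptr is c);

    string copy = to!string(c);
    assert(copy == "from C");
    assert(copy.ptr !is c);
}
```

The slice is only valid while the C memory is; the copy is what to keep if the C side may free or overwrite its buffer.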
Re: Passing string literals to C
On Wednesday, 31 December 2014 at 11:45:33 UTC, Mike Parker wrote: On 12/31/2014 8:19 PM, Laeeth Isharc wrote: Argh - no way to edit. What's best practice here? D strings are not null-terminated. === cpling.c char* cpling(char *s) { s[0]='!'; return s; } === dcaller.d extern(C) char* cpling(char* s); void callC() { writefln("%s",fromStringz(cpling("hello\0"))); } or void callC() { writefln("%s",fromStringz(cpling(toStringz("hello")))); } === am I missing a better way to do this? String literals are always null-terminated. You can typically pass them as-is and D will do the right thing (you can also pass MyStr.ptr if you want). String literals can implicitly convert to const(char)* or immutable(char)*. Neat. It doesn't appear to apply to array literals in general though...
Re: Passing string literals to C
On Wednesday, 31 December 2014 at 12:25:45 UTC, John Colvin wrote: String literals can implicitly convert to const(char)* or immutable(char)*. Neat. It doesn't appear to apply to array literals in general though... I believe this is a special case specifically for strings, added for convenience when interfacing with C. Walter has said that he is strongly against arrays decaying to pointers a la C, and D generally does not support it save for this special case.
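The special case can be checked with `__traits(compiles, ...)`. This is a small sketch; the variable names inside the test bodies are arbitrary.

```d
void main()
{
    // A string *literal* converts implicitly to a character pointer...
    const(char)* p = "abc";
    assert(p[3] == '\0');   // ...and carries a terminator after its last character

    // A string *variable* does not decay to a pointer.
    static assert(!__traits(compiles, { string s = "abc"; const(char)* q = s; }));

    // Neither does any other array.
    static assert(!__traits(compiles, { int[] a = [1, 2]; int* q = a; }));
}
```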
Re: Passing string literals to C
Thanks for the help. Laeeth
Re: Multiline String Literals without linefeeds?
John Carter: is there a similar mechanism in D? Or should I do... string foo = "long" "string" "without" "linefeeds" ; Generally you should do: string foo = "long" ~ "string" ~ "without" ~ "linefeeds"; See also: http://d.puremagic.com/issues/show_bug.cgi?id=3827 You could also write a string with newlines and then remove them at compile-time with string functions. Bye, bearophile
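Both suggestions can be sketched as compile-time constants. The names are arbitrary, and the second part assumes that `std.array.replace` can run during compile-time evaluation, which I believe it can. As far as I know, later compiler releases deprecated and then removed the implicit concatenation of adjacent literals, so `~` is the safer spelling.

```d
import std.array : replace;

// Explicit concatenation of literals, evaluated at compile time.
enum joined = "long" ~
              "string" ~
              "without" ~
              "linefeeds";
static assert(joined == "longstringwithoutlinefeeds");

// Alternative: write the text across lines, then strip the newlines
// at compile time.
enum multiline = "long
string";
enum stripped = multiline.replace("\n", "");
static assert(stripped == "longstring");

void main() {}
```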
Re: Multiline String Literals without linefeeds?
On Monday, 23 September 2013 at 09:42:59 UTC, bearophile wrote: John Carter: is there a similar mechanism in D? Or should I do... string foo = "long" "string" "without" "linefeeds" ; Generally you should do: string foo = "long" ~ "string" ~ "without" ~ "linefeeds"; See also: http://d.puremagic.com/issues/show_bug.cgi?id=3827 You could also write a string with newlines and then remove them at compile-time with string functions. Bye, bearophile Isn't "some" "string" replaced with "somestring" early on?
Re: Multiline String Literals without linefeeds?
simendsjo: Isn't "some" "string" replaced with "somestring" early on? Yes, unfortunately. And it's something Walter agreed with me to kill, but nothing has happened... Bye, bearophile
Re: Multiline String Literals without linefeeds?
On Monday, 23 September 2013 at 11:10:07 UTC, bearophile wrote: simendsjo: Isn't "some" "string" replaced with "somestring" early on? Yes, unfortunately. And it's something Walter agreed with me to kill, but nothing has happened... Bye, bearophile Rationale / link to discussion? I use it extensively.
Re: Multiline String Literals without linefeeds?
Dicebot: Rationale / link to discussion? I use it extensively. http://d.puremagic.com/issues/show_bug.cgi?id=3827 Bye, bearophile
Multiline String Literals without linefeeds?
In C/C++ in the presence of the preprocessor a string

char foo[] = "\
long\
string\
without\
linefeeds\
";

is translated by the preprocessor to

char foo[] = "longstringwithoutlinefeeds";

Is there a similar mechanism in D? Or should I do...

string foo = "long" "string" "without" "linefeeds" ;
string literals
What's the reason that the string literal is a dynamic array, not a static? So sometimes it is not possible to get string length compile time: void foo(T: E[N], E, size_t N)(auto ref T data) { pragma(msg, "static"); pragma(msg, data.length); } void foo(T: E[], E)(auto ref T data) { pragma(msg, "dynamic"); pragma(msg, data.length); // Error: variable data // cannot be read at compile time } ... foo("test");
Re: string literals
Jack Applegame: What's the reason that the string literal is a dynamic array, not a static? Originally it was a fixed sized array. But in most cases you want a dynamic array. Rust language forces you to specify where to allocate the string literal with a symbol before the string, as ~"hello". In D they have chosen a simpler solution, defaulting to dynamic. This enhancement is meant to lessen the problem a little: http://d.puremagic.com/issues/show_bug.cgi?id=481 Bye, bearophile
Re: string literals
On Friday, May 31, 2013 16:20:44 Jack Applegame wrote: What's the reason that the string literal is a dynamic array, not a static? Would you really want to end up with a copy of a string literal every time you used it? The fact that they're immutable and can be passed around without ever being copied is a definite efficiency boost for handling string literals (and a lot of string handling involves string literals). Making them static arrays wouldn't buy us anything and would cost us a lot. So sometimes it is not possible to get string length compile time: void foo(T: E[N], E, size_t N)(auto ref T data) { pragma(msg, "static"); pragma(msg, data.length); } void foo(T: E[], E)(auto ref T data) { pragma(msg, "dynamic"); pragma(msg, data.length); // Error: variable data // cannot be read at compile time } ... foo("test"); You can't get the length there because data is not known at compile time. The variable must be known at compile time for it to work with pragma. The fact that it's a string is irrelevant, and making it a static array wouldn't help any. If data were a template argument, it would work, but it's a function argument, so it won't. - Jonathan M Davis
Re: string literals
On Friday, 31 May 2013 at 15:35:51 UTC, Jonathan M Davis wrote: The fact that it's a string is irrelevant, and making it a static array wouldn't help any. If data were a template argument, it would work, but it's a function argument, so it won't. If a reference to a static array is passed as the function argument, pragma will work. But you are right. I have to find another way. Perhaps passing the string as a template alias.
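A minimal sketch of the template-argument route mentioned above. The name `literalLength` is hypothetical; the point is only that a string passed as a template value parameter is known at compile time, so its length is too.

```d
// Hypothetical helper: the string arrives as a template value parameter,
// so data.length can be used in compile-time contexts.
size_t literalLength(string data)()
{
    enum len = data.length;   // evaluated by the compiler
    pragma(msg, len);         // printed while compiling, not at run time
    return len;
}

static assert(literalLength!"test"() == 4);

void main()
{
    assert(literalLength!"test"() == 4);
}
```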
Unicode encodings and string literals
Hi, I am playing with samples from Petzold's Programming Windows converted by Andrej Mitrovic. Many thanks, Andrej. :-) My question is about string conversion. There is a function in virtually every sample named toUTF16z, which if I understand properly, converts string to UTF-16, so that they can be sent to various Windows API functions. But string literals, for example in MessageBox, are fine, no conversion is needed. I don't understand the magic, what is converted, and when? If some variable was used e.g. appName.toUTF16z, and not "Error".toUTF16z
Re: Unicode encodings and string literals
On 2012-10-08 10:06, Lubos Pintes wrote: Hi, I am playing with samples from Petzold's Programming Windows converted by Andrej Mitrovic. Many thanks, Andrej. :-) My question is about string conversion. There is a function in virtually every sample named toUTF16z, which if I understand properly, converts string to UTF-16, so that they can be sent to various Windows API functions. But string literals, for example in MessageBox, are fine, no conversion is needed. I don't understand the magic, what is converted, and when? If some variable was used e.g. appName.toUTF16z, and not "Error".toUTF16z Without looking at the code, this is my guess: The toUTF16z function converts a D string, of any Unicode encoding, to UTF-16 and converts that to a C string. String literals in D have a trailing null character \0 included, making them compatible with functions expecting C strings. String variables, on the other hand, do not have the trailing null character and therefore need a conversion. -- /Jacob Carlborg
Why aren't wide string literals zero-terminated just like strings?
Skip the rest of the code until you reach main: http://codepad.org/zPAgFnPX We have this notion that string *literals* are zero-terminated, which enables us to send them to C functions expecting zero-terminated char* strings. But the same doesn't apply to wide string literals, e.g. "somestring"w. If it did, it would save quite a bit of typing when calling WinAPI functions that expect wide strings, instead of having to call "somestring".toUTF16z. So currently: immutable(char)[] literal implicitly convertible to const(char)* and char*. immutable(wchar)[] literal not implicitly convertible to const(wchar)* and wchar*.
Re: Why aren't wide string literals zero-terminated just like strings?
On Wed, 18 May 2011 16:57:37 -0400, Andrej Mitrovic n...@none.none wrote: Skip the rest of the code until you reach main: http://codepad.org/zPAgFnPX We have this notion that string *literals* are zero-terminated, which enables us to send them to C functions expecting zero-terminated char* strings. But the same doesn't apply to wide string literals, e.g. "somestring"w. Yes it does... steves@steve-laptop:~/testd$ cat teststringlit.d wstring ws = "abcde"w; steves@steve-laptop:~/testd$ ~/dmd-2.053/linux/bin32/dmd -c teststringlit.d steves@steve-laptop:~/testd$ ~/dmd-2.053/linux/bin32/obj2asm teststringlit.o .rodata segment db 061h,000h,062h,000h,063h,000h,064h,000h ;a.b.c.d. db 065h,000h,000h,000h ;e... .rodata ends If it did, it would save quite a bit of typing when calling WinAPI functions that expect wide strings, instead of having to call "somestring".toUTF16z. So currently: immutable(char)[] literal implicitly convertible to const(char)* and char*. immutable(wchar)[] literal not implicitly convertible to const(wchar)* and wchar*. That doesn't make sense... hm... tried it out, definitely a bug. I get the error: teststringlit.d(7): Error: function teststringlit.foo (const(wchar)* widestr) is not callable using argument types (immutable(wchar)[]) teststringlit.d(7): Error: cannot implicitly convert expression ("abcde"w) of type immutable(wchar)[] to const(wchar)* A wstring literal should be able to be passed to a const(wchar)* parameter. So the literal *is* zero terminated, but the compiler isn't letting you pass it directly to a const(wchar)*. Please file with bugzilla. As a workaround, you should be able to do "somestring"w.ptr; -Steve
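What the disassembly shows can also be checked from D itself. This is a sketch with arbitrary variable names; it assumes a compiler in which the bug discussed here is fixed, so that a literal converts implicitly to `const(wchar)*`.

```d
import std.utf : toUTF16z;

void main()
{
    // A wide literal is followed by a zero code unit, as the object dump shows.
    wstring lit = "abcde"w;
    assert(lit.length == 5);
    assert(lit.ptr[lit.length] == 0);

    // A literal can be handed to a wide-pointer parameter directly
    // (assumes the implicit conversion works in the compiler at hand).
    const(wchar)* q = "Error";
    assert(q[0] == 'E');
    assert(q[5] == 0);

    // A string built at run time needs toUTF16z to get the terminator.
    string name = "app".idup ~ "Name";
    const(wchar)* p = name.toUTF16z;
    assert(p[0] == 'a');
    assert(p[7] == 0);
}
```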
Re: Why aren't wide string literals zero-terminated just like strings?
Ah, I had the wrong assumption but it is a bug. Reported: http://d.puremagic.com/issues/show_bug.cgi?id=6032 And thanks for disassembling!
std.conv.parse of string literals
This is a small D2 program that uses parse: import std.conv: parse; void main() { parse!int(111); parse!int(111); } Gives the error: std.conv.ConvError: std.conv(1122): Can't convert value `' of type string base 2 to type int But a string literal isn't a lvalue. This seems all wrong. In Bugzilla there are related bugs about strings. Do you think this is worth another bugzilla entry? Bye and thank you, bearophile
Re: std.conv.parse of string literals
This might be related to that bug report you wrote where you could assign one string literal to another. bearophile Wrote: But a string literal isn't an lvalue. This seems all wrong.
Re: std.conv.parse of string literals
Andrej Mitrovic: This might be related to that bug report you wrote where you could assign one string literal to another. Right. And recently there's another similar bug report in Bugzilla. So I may add this case just to one of those bug reports. Bye, bearophile
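As a hedged illustration of why parse and literals do not mix: std.conv.parse takes its input by ref and advances it past what it consumed, so it wants a variable, while std.conv.to converts a whole value and is happy with a literal. The helper restAfterInt below is a hypothetical name introduced only for this sketch.

```d
import std.conv : parse, to;

// Hypothetical helper: parse a leading integer from input and
// return whatever text is left over after the number.
string restAfterInt(string input, out int value)
{
    // input is a local lvalue here, so parse may advance it.
    value = parse!int(input);
    return input;
}

void main()
{
    int n;
    string rest = restAfterInt("111 apples", n);
    assert(n == 111);
    assert(rest == " apples");

    // For a whole-string conversion of a literal, to!int is the simpler tool.
    assert(to!int("111") == 111);
}
```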
Re: String literals have only one instance?
Rory Mcguire wrote: Are all string literals that have the same value initialized to the same address? void main() { string same() { return "This"; } assert("This" is same()); assert("This" is "This"); } Can this be relied upon? Interesting, thanks guys. Was just curious about the speed of comparisons for string literals. Because I believe the string comparisons check if a string is another string first.
String literals have only one instance?
Are all string literals that have the same value initialized to the same address? void main() { string same() { return "This"; } assert("This" is same()); assert("This" is "This"); } Can this be relied upon?
Re: String literals have only one instance?
On 8/19/10, Rory Mcguire rjmcgu...@gm_no_ail.com wrote: Are all string literals that have the same value initialized to the same address? void main() { string same() { return "This"; } assert("This" is same()); assert("This" is "This"); } Can this be relied upon? Well, since in Windows at least, string literals can be concatenated to and whatnot, I very much doubt that there's any sharing involved. You can always check with the is operator though. If it reports true, then the two strings have the same instance. If it reports false, then they don't.
Re: String literals have only one instance?
Rory Mcguire: Are all string literals that have the same value initialized to the same address? ... Can this be relied upon? Probably a conforming D implementation is free to not give the same address to those. Bye, bearophile
Re: String literals have only one instance?
Jonathan Davis wrote: snip You can always check with the is operator though. If it reports true, then the two strings have the same instance. If it reports false, then they don't. I can't see how testing each string literal to see if it's the same instance as another can work. The OP's point is: Are identical string literals *guaranteed* to be the same instance? Regardless of implementation? Regardless of whether they're next to each other, in different modules or anything in between? Regardless of the phase of the moon? Stewart.
Re: String literals have only one instance?
On 19.08.2010 09:53, Rory Mcguire wrote: Are all string literals that have the same value initialized to the same address? void main() { string same() { return "This"; } assert("This" is same()); assert("This" is "This"); } Can this be relied upon? I don't think so. It might work now, as we only have static linking, but what happens if we have 2 independent shared libraries with the string "This"? Each library has to include the string because the libraries don't depend on each other, but as soon as a program uses both libraries there are 2 memory locations where the string could be. (I guess the linker won't do some magic to make these point at the same location. But I might be wrong.) -- Johannes Pfau
Re: String literals have only one instance?
Rory Mcguire rjmcgu...@gm_no_ail.com wrote: Are all string literals that have the same value initialized to the same address? void main() { string same() { return "This"; } assert("This" is same()); assert("This" is "This"); } Can this be relied upon? No. The same string in different object files may be different instances, as may of course those in dynamically linked libraries. I would think the optimizer feels free to move string literals around as it sees fit, and the spec does not anywhere state that the compiler should merge string literals. -- Simen
Re: String literals have only one instance?
Rory Mcguire Wrote: Are all string literals that have the same value initialized to the same address? void main() { string same() { return "This"; } assert("This" is same()); assert("This" is "This"); } Can this be relied upon? This should be expected but I wouldn't rely upon it.
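Since the replies above agree that literal merging is not guaranteed, a small sketch may help separate the two questions being asked: identity (is, which compares pointer and length) and equality (==, which compares contents). The helper names sameInstance and sameContents are hypothetical and exist only for this illustration; the sketch deliberately avoids asserting anything about whether two separate literals share an address.

```d
// Hypothetical helpers, named only for this illustration.
// Identity: same pointer and same length.
bool sameInstance(const(char)[] a, const(char)[] b)
{
    return a is b;
}

// Equality: same contents, wherever they live in memory.
bool sameContents(const(char)[] a, const(char)[] b)
{
    return a == b;
}

void main()
{
    string a = "This";
    string b = a;        // copies the slice (pointer + length), not the data
    char[] c = a.dup;    // fresh copy of the data at a different address

    assert(sameInstance(a, b));
    assert(sameContents(a, c));
    assert(!sameInstance(a, c));

    // Same pointer but different length is not the same instance either.
    assert(!sameInstance(a, a[0 .. 2]));
}
```

Code that needs to know whether two strings hold the same text should use ==; is only answers whether two slices refer to the very same memory.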
Why are string literals zero-terminated?
Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know? /Max Did you test with a string that was not in the code itself, e.g. from a config file? String literals are null terminated so you wouldn't have had an issue if all your strings were literals. Utf8 doesn't contain the string length, so you will run into problems eventually. You have to use toStringz or your own null terminator. Unless of course you know that the function will always be taking string literals. But even then leaving something like that up to the programmer to remember is not exactly foolproof. Enjoy. ~Rory Hey again and thanks for the hint. I tried finding something on the DM page about string literals being null terminated and while the section about string literals didn't even mention it, it was said some place else. That explains why using string literals works even though I expected it to fail. It's indeed good to know and adding std.string.toStringz is probably a good idea ;). Thanks. Greetings, Max. Sure, I must admit it is annoying when the same code can do different things just because of where the data came from. It would be easier to notice the bug if D never added a null on literals, but then there would also be a lot more usages of toStringz. I think if you want to test it you can do: auto s = "blah"; open(s[0..$].dup.ptr); // duplicating it should put it somewhere else // just slicing will not test When thinking about it, it makes sense to have string literals null terminated in order to have C functions work with them. However, I wonder about some stuff, for instance: string s = "string"; // is s == "string\0" now? char[] c = cast(char[])s; // is c[6] == '\0' now? char* p = s.ptr; // is *(p+6) == '\0' now? I think use of the zero terminator should be consistent.
Either make every string (and char[] for that matter) zero terminated in the underlying memory for backwards compatibility with C or leave it to the user in all cases. /Max Perhaps the NULL is there because it's there in the executable file? NULL is also often after a dynamic array simply because of D always initializing memory, and when you get an allocation often a larger amount is allocated which remains NULL.
Re: Why are string literals zero-terminated?
On Tue, 20 Jul 2010 13:26:56 +, Lars T. Kyllingstad wrote: On Tue, 20 Jul 2010 14:59:18 +0200, awishformore wrote: Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know? So you can pass them to C functions. Note that even though string literals are zero terminated, the actual string (the array, that is) doesn't contain the zero character. It's located at the memory position immediately following the string. string s = "hello"; assert (s[$-1] != '\0'); // Last character of s is 'o', not '\0' assert (s.ptr[s.length] == '\0'); Why is it only so for literals? That is because the compiler can only guarantee the zero-termination of string literals. The memory following a string in general could contain anything. string s = getStringFromSomewhere(); // I have no idea where s is coming from, so I don't // know whether it is zero-terminated or not. Better // make sure. someCFunction(toStringz(s)); -Lars
Re: Why are string literals zero-terminated?
On 20.07.2010 15:38, Lars T. Kyllingstad wrote: On Tue, 20 Jul 2010 13:26:56 +, Lars T. Kyllingstad wrote: On Tue, 20 Jul 2010 14:59:18 +0200, awishformore wrote: Following this discussion on announce, I was wondering why string literals are zero-terminated. Or to re-formulate, why only string literals are zero-terminated. Why that inconsistency? What's the rationale behind it? Does anyone know? So you can pass them to C functions. Note that even though string literals are zero terminated, the actual string (the array, that is) doesn't contain the zero character. It's located at the memory position immediately following the string. string s = "hello"; assert (s[$-1] != '\0'); // Last character of s is 'o', not '\0' assert (s.ptr[s.length] == '\0'); Why is it only so for literals? That is because the compiler can only guarantee the zero-termination of string literals. The memory following a string in general could contain anything. string s = getStringFromSomewhere(); // I have no idea where s is coming from, so I don't // know whether it is zero-terminated or not. Better // make sure. someCFunction(toStringz(s)); -Lars Hey. Yes, that indeed makes a lot of sense. I didn't actually try those asserts because I'm currently not on a dev machine, but what you point out basically is the behaviour I was hoping for. Thanks for clearing this up. /Max
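For anyone who, like Max, could not run Lars's asserts at the time, here is a self-contained sketch of the same behaviour. It uses strlen from core.stdc.string as the C function and std.string.toStringz for the safe conversion; cLength is a hypothetical wrapper name introduced only for this example.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper: length as a C function would see it,
// after making sure the data is zero-terminated.
size_t cLength(string s)
{
    return strlen(toStringz(s));
}

void main()
{
    string lit = "hello";

    // The terminator is not part of the array...
    assert(lit.length == 5);
    assert(lit[$ - 1] == 'o');
    // ...it sits in the byte immediately after it.
    assert(lit.ptr[lit.length] == '\0');
    assert(strlen(lit.ptr) == 5);

    // A slice of the literal is followed by the rest of the literal,
    // not by a zero, so passing part.ptr to C would read too far.
    string part = lit[0 .. 3];
    assert(part.ptr[part.length] == 'l');

    // toStringz guarantees termination whatever the origin of the string.
    assert(cLength(part) == 3);
}
```

The slice case is the one that bites in practice: the pointer is valid, the memory after it is readable, and the C function simply sees a longer string than intended.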