Re: Some questions about strings

2020-06-22 Thread Denis via Digitalmars-d-learn
On Monday, 22 June 2020 at 09:06:35 UTC, Jacob Carlborg wrote: String **literals** have a terminating null character, to help with integrating with C functions. But this null character will disappear when manipulating strings. You cannot assume that a function parameter of type `string`

Re: Some questions about strings

2020-06-22 Thread Jacob Carlborg via Digitalmars-d-learn
On Monday, 22 June 2020 at 04:08:10 UTC, Denis wrote: The terminating null character was one of the reasons I thought strings were different from char arrays. Now I know better. String **literals** have a terminating null character, to help with integrating with C functions. But this null
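
To illustrate the point about literals, here is a minimal sketch, not taken from the thread. It assumes the usual way of passing a non-literal string to C is `std.string.toStringz`; the variable names are made up for the example.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // A literal is followed by a '\0' in memory, but the null is not
    // part of the slice: .length does not count it.
    string lit = "hello";
    assert(lit.length == 5);

    // Passing .ptr to a C function is fine here only because lit
    // still points at the literal itself.
    assert(strlen(lit.ptr) == 5);

    // A literal also converts implicitly to a C string pointer.
    const(char)* p = "hello";
    assert(strlen(p) == 5);

    // A string built at run time has no guaranteed terminator, so
    // ask for a null-terminated copy explicitly.
    string built = lit ~ " world";
    const(char)* cstr = built.toStringz;
    assert(strlen(cstr) == 11);
}
```

The same caution applies to slices such as `lit[0 .. 3]`: the slice is valid D, but nothing places a null after its last character.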

Re: Some questions about strings

2020-06-21 Thread Denis via Digitalmars-d-learn
On Monday, 22 June 2020 at 04:32:32 UTC, Mike Parker wrote: On Monday, 22 June 2020 at 04:08:10 UTC, Denis wrote: On Monday, 22 June 2020 at 03:31:17 UTC, Ali Çehreli wrote: : string is char[] wstring is wchar[] dstring is dchar[] Got it now. This is the critical piece I missed: I understand

Re: Some questions about strings

2020-06-21 Thread Mike Parker via Digitalmars-d-learn
On Monday, 22 June 2020 at 04:08:10 UTC, Denis wrote: On Monday, 22 June 2020 at 03:31:17 UTC, Ali Çehreli wrote: : string is char[] wstring is wchar[] dstring is dchar[] Got it now. This is the critical piece I missed: I understand the relations between the char types and the UTF encodings

Re: Some questions about strings

2020-06-21 Thread Denis via Digitalmars-d-learn
On Monday, 22 June 2020 at 03:31:17 UTC, Ali Çehreli wrote: : string is char[] wstring is wchar[] dstring is dchar[] Got it now. This is the critical piece I missed: I understand the relations between the char types and the UTF encodings (thanks to your book). But I mistakenly thought that

Re: Some questions about strings

2020-06-21 Thread Denis via Digitalmars-d-learn
On Monday, 22 June 2020 at 03:49:01 UTC, Adam D. Ruppe wrote: On Monday, 22 June 2020 at 03:43:58 UTC, Denis wrote: My code reads a UTF-8 encoded file into a buffer and validates, byte by byte, the UTF-8 encoding along with some additional validation. If I simply return the UTF-8 encoded

Re: Some questions about strings

2020-06-21 Thread Adam D. Ruppe via Digitalmars-d-learn
On Monday, 22 June 2020 at 03:43:58 UTC, Denis wrote: My code reads a UTF-8 encoded file into a buffer and validates, byte by byte, the UTF-8 encoding along with some additional validation. If I simply return the UTF-8 encoded string, there won't be another decoding/encoding done -- correct?
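
A rough sketch of the scenario in the question, under stated assumptions: it swaps the hand-written byte-by-byte check for Phobos' `std.utf.validate`, and `toValidatedString` is a hypothetical helper name, not something from the thread. The cast only reinterprets the bytes as `char`; nothing is decoded or re-encoded when the string is returned.

```d
import std.exception : assertThrown;
import std.utf : validate, UTFException;

// Hypothetical helper: view a byte buffer as UTF-8 text, check it,
// and return an immutable copy.
string toValidatedString(const(ubyte)[] buffer)
{
    // Reinterpret the bytes as UTF-8 code units; no conversion happens.
    auto text = cast(const(char)[]) buffer;

    // Throws UTFException if the buffer is not well-formed UTF-8.
    validate(text);

    // Copy into immutable storage so the result is a proper string.
    return text.idup;
}

void main()
{
    immutable ubyte[] good = [0xE2, 0x82, 0xAC]; // U+20AC in UTF-8
    assert(toValidatedString(good) == "€");

    immutable ubyte[] bad = [0xE2, 0x82];        // truncated sequence
    assertThrown!UTFException(toValidatedString(bad));
}
```

If the buffer is known to have no other references, the `.idup` copy could presumably be avoided, but that is a separate design choice.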

Re: Some questions about strings

2020-06-21 Thread Denis via Digitalmars-d-learn
On Monday, 22 June 2020 at 03:24:37 UTC, Adam D. Ruppe wrote: On Monday, 22 June 2020 at 03:17:54 UTC, Denis wrote: - First, is there any difference between string, wstring and dstring? Yes, they encode the same content differently in the bytes. If you cast it to ubyte[] and print that out

Re: Some questions about strings

2020-06-21 Thread Ali Çehreli via Digitalmars-d-learn
On 6/21/20 8:17 PM, Denis wrote:
> I have a few questions about how strings are stored.
>
> - First, is there any difference between string, wstring and dstring?

string is char[]
wstring is wchar[]
dstring is dchar[]

char is 1 byte: UTF-8 code unit
wchar is 2 bytes: UTF-16 code unit
dchar is 4
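
The relationships listed above can be checked at compile time. This is a small sketch rather than part of the original post. Strictly speaking the aliases are arrays of `immutable` code units, so `string` is `immutable(char)[]`. The euro sign is just a convenient 3-byte example character.

```d
import std.stdio : writeln;

void main()
{
    // The three string types are aliases for arrays of immutable
    // code units.
    static assert(is(string  == immutable(char)[]));
    static assert(is(wstring == immutable(wchar)[]));
    static assert(is(dstring == immutable(dchar)[]));

    // Code unit sizes in bytes.
    static assert(char.sizeof  == 1);
    static assert(wchar.sizeof == 2);
    static assert(dchar.sizeof == 4);

    // The same character, U+20AC, stored in each encoding.
    string  s = "€";
    wstring w = "€"w;
    dstring d = "€"d;

    // .length counts code units, not characters.
    assert(s.length == 3); // three UTF-8 code units
    assert(w.length == 1); // one UTF-16 code unit
    assert(d.length == 1); // one UTF-32 code unit

    writeln(s.length, " ", w.length, " ", d.length);
}
```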

Re: Some questions about strings

2020-06-21 Thread Adam D. Ruppe via Digitalmars-d-learn
On Monday, 22 June 2020 at 03:17:54 UTC, Denis wrote: - First, is there any difference between string, wstring and dstring? Yes, they encode the same content differently in the bytes. If you cast it to ubyte[] and print that out you can see the difference. - Are the characters of a string
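
A minimal sketch of the cast suggested above. `rawBytes` is a made-up helper name for the example. The byte order printed for the `wstring` and `dstring` cases depends on the platform's endianness, so only the lengths are asserted for those.

```d
import std.stdio : writefln;

// Hypothetical helper: reinterpret the code units of any string type
// as raw bytes, without copying or converting.
const(ubyte)[] rawBytes(T)(const(T)[] text)
{
    return cast(const(ubyte)[]) text;
}

void main()
{
    string  s = "€";
    wstring w = "€"w;
    dstring d = "€"d;

    // UTF-8 encodes U+20AC as three bytes.
    immutable ubyte[] utf8 = [0xE2, 0x82, 0xAC];
    assert(rawBytes(s) == utf8);

    // UTF-16 and UTF-32 need 2 and 4 bytes for this character.
    assert(rawBytes(w).length == 2);
    assert(rawBytes(d).length == 4);

    // Print each byte as two hex digits, separated by spaces.
    writefln("%(%02x %)", rawBytes(s));
    writefln("%(%02x %)", rawBytes(w));
    writefln("%(%02x %)", rawBytes(d));
}
```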

Some questions about strings

2020-06-21 Thread Denis via Digitalmars-d-learn
I have a few questions about how strings are stored. - First, is there any difference between string, wstring and dstring? For example, a 3-byte Unicode character literal can be assigned to a variable of any of these types, then printed, etc, without errors. - Are the characters of a string