Re: Our Sister

2016-05-31 Thread Marco Leise via Digitalmars-d
On Wed, 1 Jun 2016 01:06:36 +1000,
Manu via Digitalmars-d wrote:

> D loves templates, but templates aren't a given. Closed-source
> projects often can't have templates in the public API (ie, source
> should not be available), and this is my world.

Same effect for GPL code. Funny. (Template instantiations are
like statically linking in the open source code.)

-- 
Marco



Re: Our Sister

2016-05-31 Thread Manu via Digitalmars-d
On 31 May 2016 at 01:00, Marco Leise via Digitalmars-d
 wrote:
> On Sat, 28 May 2016 14:15:45 +1000,
> Manu via Digitalmars-d wrote:
>
>> On 28 May 2016 at 10:16, Adam D. Ruppe via Digitalmars-d
>>  wrote:
>> > On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:
>> >>
>> >> not if [] would be ref-counted too ;-)
>> >
>> >
>> > That would be kinda horrible. Right now, slicing is virtually free and
>> > compatible with all kinds of backing schemes. If it became refcounted, 
>> > it'd:
>> >
>> > 1) have to keep a pointer to the refcount structure with the slice, adding
>> > memory cost
>>
>> This is only true for the owner. If we had 'scope', or something like
>> it (ie, borrowing in rust lingo), then the fat slice wouldn't need to
>> be passed around, it's only a burden on the top-level owner.
>> 'scope' is consistently rejected, but it solves so many long-standing
>> problems we have, and this reduction of 'fat'(/rc)-slices to normal
>> slices is a particularly important one.
>
> I second that thought. But I'd be ok with an unsafe slice and
> making sure myself, that I don't keep a reference around. A
> lot of functions only borrow data and can work on a naked
> pointer/ref/slice, while the owner(s) have the smart pointer.
> These can of course be converted to templates taking either
> char[] or RCStr, but I think borrowing is cleaner when the
> function in question doesn't care a bag of beans if the chars
> it works on were allocated on the GC heap or reference counted.

D loves templates, but templates aren't a given. Closed-source
projects often can't have templates in the public API (ie, source
should not be available), and this is my world.


Re: Our Sister

2016-05-31 Thread Nick Treleaven via Digitalmars-d

On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:

On 05/27/2016 05:02 PM, Era Scarecrow wrote:
  With the current state of things, I'll just take your word 
on it.


Reasoning is simple - yes we could safely convert to 
const(char)[] but that means effectively all refcounting is 
lost for that string. So we can convert but in an explicit 
manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei


We could have:

const(char)[] s = rcstr.stealSlice;

Which is null* if the refcount is > 1. rcstr would then be empty 
on success. In fact if with the RC DIP we guarantee the memory 
doesn't escape, stealSlice could return string.


*Or better, return an Option.
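
A minimal sketch of the stealSlice idea, assuming a toy reference-counted string. `ToyRCStr`, its fields, and `empty` are hypothetical names used only for illustration; this is not the RCStr design from the thread, and the payload is GC-backed purely to keep the example short.

```d
/// Hypothetical stand-in for RCStr, for illustration only.
struct ToyRCStr
{
    private char[] payload;   // GC-backed here purely for simplicity
    private size_t* count;    // reference count shared by all copies

    this(const(char)[] s)
    {
        payload = s.dup;
        count = new size_t;
        *count = 1;
    }

    this(this)                // copying shares the payload
    {
        if (count) ++*count;
    }

    ~this()
    {
        if (count) --*count;
    }

    /// Hands the payload over only when this is the sole owner;
    /// otherwise returns null and leaves the string untouched.
    const(char)[] stealSlice()
    {
        if (count is null || *count > 1)
            return null;
        auto s = payload;
        payload = null;       // the string is empty on success
        count = null;
        return s;
    }

    bool empty() const { return payload.length == 0; }
}

unittest
{
    auto a = ToyRCStr("hello");
    {
        auto b = a;                       // two owners now
        assert(a.stealSlice is null);     // shared, so the steal is refused
    }                                     // b is destroyed, one owner left
    const(char)[] s = a.stealSlice;
    assert(s == "hello");
    assert(a.empty);
}
```

Returning an optional type instead of null, as the footnote suggests, would make the failure case harder to overlook.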


Re: Our Sister

2016-05-30 Thread Marco Leise via Digitalmars-d
On Sat, 28 May 2016 14:15:45 +1000,
Manu via Digitalmars-d wrote:

> On 28 May 2016 at 10:16, Adam D. Ruppe via Digitalmars-d
>  wrote:
> > On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:  
> >>
> >> not if [] would be ref-counted too ;-)  
> >
> >
> > That would be kinda horrible. Right now, slicing is virtually free and
> > compatible with all kinds of backing schemes. If it became refcounted, it'd:
> >
> > 1) have to keep a pointer to the refcount structure with the slice, adding
> > memory cost  
> 
> This is only true for the owner. If we had 'scope', or something like
> it (ie, borrowing in rust lingo), then the fat slice wouldn't need to
> be passed around, it's only a burden on the top-level owner.
> 'scope' is consistently rejected, but it solves so many long-standing
> problems we have, and this reduction of 'fat'(/rc)-slices to normal
> slices is a particularly important one.

I second that thought. But I'd be ok with an unsafe slice and
making sure myself, that I don't keep a reference around. A
lot of functions only borrow data and can work on a naked
pointer/ref/slice, while the owner(s) have the smart pointer.
These can of course be converted to templates taking either
char[] or RCStr, but I think borrowing is cleaner when the
function in question doesn't care a bag of beans if the chars
it works on were allocated on the GC heap or reference counted.

-- 
Marco



Re: Our Sister

2016-05-29 Thread Dicebot via Digitalmars-d
On 05/27/2016 01:17 AM, Seb wrote:
> Oh yes that's what I meant. Sorry for being so confusing.
> __Right__ is way more important than breakages. For that we have `dfix`.

Don't get overly excited. dfix will never be capable of automatic fixup
when such deep levels of semantic analysis are required; this can only be
done by the compiler itself (which is currently not designed for
fixup-style tasks).


Re: Our Sister

2016-05-28 Thread Adam D. Ruppe via Digitalmars-d

On Saturday, 28 May 2016 at 04:15:45 UTC, Manu wrote:
This is only true for the owner. If we had 'scope', or something like
it (ie, borrowing in rust lingo), then the fat slice wouldn't need to
be passed around


Right, I agree - if we keep the slice just the way it is now, it 
all still works if you borrow correctly!


(BTW, I don't think we even need this to be strictly @safe, 
though it would be nice if it was tested, we could say @system 
getSlice and potentially change it to @safe later.)


Re: Our Sister

2016-05-28 Thread Marc Schütz via Digitalmars-d

On Saturday, 28 May 2016 at 04:28:16 UTC, Manu wrote:
On 27 May 2016 at 23:32, Andrei Alexandrescu via Digitalmars-d 
 wrote:

On 5/27/16 7:07 AM, Marc Schütz wrote:

It should _safely_ convert to `const(char)[]`.



That is not possible, sorry. -- Andrei


It should safely convert to 'scope const(char)[]', then we only 
need a fat-slice or the like at the very top of the callstack...


I didn't want to mention the s-word ;-)


Re: Our Sister

2016-05-28 Thread Marc Schütz via Digitalmars-d

On Friday, 27 May 2016 at 13:32:30 UTC, Andrei Alexandrescu wrote:

On 5/27/16 7:07 AM, Marc Schütz wrote:
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:

RFC: what primitives should RCStr have?


It should _safely_ convert to `const(char)[]`.


That is not possible, sorry. -- Andrei


It is when DIP25 [1] is finally fully implemented (by that I mean 
including for slices and pointers etc., Walter told me at Dconf 
that this is going to happen), and the problem with aliasing 
references is solved (which needs to happen anyway for any 
reference counting to be safe).


[1] https://wiki.dlang.org/DIP25


Re: Our Sister

2016-05-28 Thread ZombineDev via Digitalmars-d

On Saturday, 28 May 2016 at 09:43:41 UTC, ZombineDev wrote:
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:
I've been working on RCStr (endearingly pronounced "Our 
Sister"), D's up-and-coming reference counted string type. The 
goals are:




RCStr may be an easier first step, but I think generic dynamic 
arrays are more interesting, because they are more generally 
applicable and user types like move-only resources make them a 
more challenging problem to solve.


BTW, what happened to scope? Generally speaking, I'm not a fan 
of Rust, and I know that you think that D needs to 
differentiate, but I like their borrowing model for several 
reasons:
a) while not 100% safe and quite verbose, it offers enough 
improvements over @safe D to make it a worthwhile upgrade, if 
you don't care about any other language features
b) it's not that hard to grasp / almost natural for people 
familiar with C++11's copy (shared_ptr) and move (unique_ptr) 
semantics.
c) it's general enough that it can be applied to areas like 
iterator invalidation, thread synchronization and other logic 
bugs, like some third-party rust packages demonstrate.


I think that improving escape analysis with the scope attribute 
can go a long way to narrowing the gap between Rust and D in 
that area.


The other elephant(s) in the room are nested contexts like 
delegates, nested structs and some alias template parameter 
arguments. These are especially bad because the user has zero 
control over those GC allocations, which makes some of D's key 
features unusable in @nogc contexts.





* Reference counted, shouldn't leak if all instances 
destroyed; even if not, use the GC as a last-resort 
reclamation mechanism.


* Entirely @safe.

* Support UTF 100% by means of RCStr!char, RCStr!wchar etc. 
but also raw manipulation and custom encodings via 
RCStr!ubyte, RCStr!ushort etc.


* Support several views of the same string, e.g. given s of 
type RCStr!char, it can be iterated byte-wise, code 
point-wise, code unit-wise etc. by using s.by!ubyte, 
s.by!char, s.by!dchar etc.


* Support const and immutable qualifiers for the character 
type.


* Work well with const and immutable when they qualify the 
entire RCStr type.


* Fast: use the small string optimization and various other 
layout and algorithms to make it a good choice for high 
performance strings


RFC: what primitives should RCStr have?


Thanks,

Andrei


0) (Prerequisite) Composition/interaction with language 
features/user types - RCStr in nested contexts (alias template 
parameters, delegates, nested structs/classes), array of 
RCStr-s, RCStr as a struct/class member, RCStr passed as 
(const) ref parameter, etc. should correctly increase/decrease 
ref count. This is also a prerequisite for safe RefCounted!T.
Action item: related compiler bugs should be prioritized. E.g. 
the RAII bug from
Shachar Shemesh's lightning talk - 
http://forum.dlang.org/post/n8algm$qra$1...@digitalmars.com.

See also:
https://issues.dlang.org/buglist.cgi?quicksearch=raii_id=208631
https://issues.dlang.org/buglist.cgi?quicksearch=destructor_id=208632
(not everything in those lists is related but there are some 
nasty ones, like bad RVO codegen).


1) Safe slicing

2) shared overloads of member functions (e.g. for stuff like 
atomic incRef/decRef)


3) Concatenation (RCStr ~= RCStr ~ RCStr ~ char)

4) (Optional) Reserving (pre-allocating capacity) / shrinking. 
I labeled this feature request as optional, as it's not clear 
if RCStr is more like a container, or more like a slice/range.


5) Some sort of optimization for zero-terminated strings. Quite 
often one needs to interact with C APIs, which requires calling 
toStringz / toUTFz, which causes unnecessary allocations. It 
would be great if RCStr could efficiently handle this scenario.


6) !!! Not really a primitive, but we need to make sure that 
applying a chain of range transformations won't break ownership 
(e.g. leak or free prematurely).


7) Should be able to replace GC usage in transient ranges like 
e.g. File.byLine


8) Cheap initialization/assignment from string literals - 
should be roughly the same as either initializing a static 
character array (if the small string optimization is used) or 
just making it point to read-only memory in the data segment of 
the executable. It shouldn't try to write or free such memory. 
When initialized from a string literal, RCStr should also offer 
a null-terminating byte, provided that it points to the whole
If one wants to assign a string literal by overwriting parts of 
the already allocated storage, std.algorithm.mutation.copy 
should be used instead.


There may be other important primitives which I haven't thought 
of, but generally we should try to leverage std.algorithm, 
std.range, std.string and std.uni for them, via UFCS.


--

On a related note, I know that you want to use AffixAllocator 
for reference counting, and I think it's a great idea. I have 
one quest

Re: Our Sister

2016-05-28 Thread ZombineDev via Digitalmars-d
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:
I've been working on RCStr (endearingly pronounced "Our 
Sister"), D's up-and-coming reference counted string type. The 
goals are:




RCStr may be an easier first step, but I think generic dynamic 
arrays are more interesting, because they are more generally 
applicable and user types like move-only resources make them a 
more challenging problem to solve.


BTW, what happened to scope? Generally speaking, I'm not a fan of 
Rust, and I know that you think that D needs to differentiate, 
but I like their borrowing model for several reasons:
a) while not 100% safe and quite verbose, it offers enough 
improvements over @safe D to make it a worthwhile upgrade, if you 
don't care about any other language features
b) it's not that hard to grasp / almost natural for people 
familiar with C++11's copy (shared_ptr) and move (unique_ptr) 
semantics.
c) it's general enough that it can be applied to areas like 
iterator invalidation, thread synchronization and other logic 
bugs, like some third-party rust packages demonstrate.


I think that improving escape analysis with the scope attribute 
can go a long way to narrowing the gap between Rust and D in that 
area.


The other elephant(s) in the room are nested contexts like 
delegates, nested structs and some alias template parameter 
arguments. These are especially bad because the user has zero 
control over those GC allocations, which makes some of D's key 
features unusable in @nogc contexts.





* Reference counted, shouldn't leak if all instances destroyed; 
even if not, use the GC as a last-resort reclamation mechanism.


* Entirely @safe.

* Support UTF 100% by means of RCStr!char, RCStr!wchar etc. but 
also raw manipulation and custom encodings via RCStr!ubyte, 
RCStr!ushort etc.


* Support several views of the same string, e.g. given s of 
type RCStr!char, it can be iterated byte-wise, code point-wise, 
code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar 
etc.


* Support const and immutable qualifiers for the character type.

* Work well with const and immutable when they qualify the 
entire RCStr type.


* Fast: use the small string optimization and various other 
layout and algorithms to make it a good choice for high 
performance strings


RFC: what primitives should RCStr have?


Thanks,

Andrei


0) (Prerequisite) Composition/interaction with language 
features/user types - RCStr in nested contexts (alias template 
parameters, delegates, nested structs/classes), array of RCStr-s, 
RCStr as a struct/class member, RCStr passed as (const) ref 
parameter, etc. should correctly increase/decrease ref count. 
This is also a prerequisite for safe RefCounted!T.
Action item: related compiler bugs should be prioritized. E.g. 
the RAII bug from
Shachar Shemesh's lightning talk - 
http://forum.dlang.org/post/n8algm$qra$1...@digitalmars.com.

See also:
https://issues.dlang.org/buglist.cgi?quicksearch=raii_id=208631
https://issues.dlang.org/buglist.cgi?quicksearch=destructor_id=208632
(not everything in those lists is related but there are some 
nasty ones, like bad RVO codegen).


1) Safe slicing

2) shared overloads of member functions (e.g. for stuff like 
atomic incRef/decRef)


3) Concatenation (RCStr ~= RCStr ~ RCStr ~ char)

4) (Optional) Reserving (pre-allocating capacity) / shrinking. I 
labeled this feature request as optional, as it's not clear if 
RCStr is more like a container, or more like a slice/range.


5) Some sort of optimization for zero-terminated strings. Quite 
often one needs to interact with C APIs, which requires calling 
toStringz / toUTFz, which causes unnecessary allocations. It 
would be great if RCStr could efficiently handle this scenario.


6) !!! Not really a primitive, but we need to make sure that 
applying a chain of range transformations won't break ownership 
(e.g. leak or free prematurely).


7) Should be able to replace GC usage in transient ranges like 
e.g. File.byLine


8) Cheap initialization/assignment from string literals - should 
be roughly the same as either initializing a static character 
array (if the small string optimization is used) or just making 
it point to read-only memory in the data segment of the 
executable. It shouldn't try to write or free such memory. When 
initialized from a string literal, RCStr should also offer a 
null-terminating byte, provided that it points to the whole
If one wants to assign a string literal by overwriting parts of 
the already allocated storage, std.algorithm.mutation.copy should 
be used instead.
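
Item 5 can be illustrated with a small sketch. It assumes a hypothetical buffer type, `ZBuf`, that always stores one spare '\0' after its payload, so handing a pointer to a C API needs no extra toStringz-style copy. The names `ZBuf`, `slice` and `cPtr` are made up for this example and say nothing about how RCStr would actually do it.

```d
import core.stdc.string : strlen;

/// Hypothetical buffer that keeps a terminator after the payload.
struct ZBuf
{
    private char[] store;   // payload followed by one '\0'

    this(const(char)[] s)
    {
        store = new char[s.length + 1];
        store[0 .. s.length] = s[];
        store[s.length] = '\0';   // char.init is 0xFF, so set it explicitly
    }

    /// The D-side view excludes the terminator.
    const(char)[] slice() const
    {
        return store.length ? store[0 .. $ - 1] : null;
    }

    /// The C-side view is available without another allocation.
    const(char)* cPtr() const { return store.ptr; }
}

unittest
{
    auto z = ZBuf("abc");
    assert(z.slice == "abc");
    assert(strlen(z.cPtr) == 3);

    // String literals are already zero-terminated by the language.
    assert(strlen("abc".ptr) == 3);
}
```

The trade-off is one extra byte per string in exchange for avoiding a copy at every C call.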


There may be other important primitives which I haven't thought 
of, but generally we should try to leverage std.algorithm, 
std.range, std.string and std.uni for them, via UFCS.


--

On a related note, I know that you want to use AffixAllocator for 
reference counting, and I think it's a great idea. I have one 
question, which wasn't answered during that discussion:


// Use a nig

Re: Our Sister

2016-05-27 Thread Bill Hicks via Digitalmars-d

On Saturday, 28 May 2016 at 04:31:22 UTC, Manu wrote:
On 27 May 2016 at 02:11, Andrei Alexandrescu via Digitalmars-d 
<digitalmars-d@puremagic.com> wrote:
I've been working on RCStr (endearingly pronounced "Our 
Sister"),


Ah, I totally skipped over this thread...

Wow... this really doesn't work in any accent I'm close to, but I can
hear it if I imagine you saying it ;)
If I said RCStr, it sounds like 'are'-'see'-strrr, but 'our sister'
would be 'hour'-sistə... isn't it strange that word recognition seems
to work pretty much reliably down a sliding scale until an arbitrary
point where it just drops off. There's not a lot of fuzzy area in the
middle.


The joke is revealed when you consider the fact that there are no 
females in this community, let alone sisters.  A kind of joke 
that's funny only to certain types of men.


Now let's get back to fixing things before the world finds out 
that D is full of holes.


Re: Our Sister

2016-05-27 Thread Manu via Digitalmars-d
On 27 May 2016 at 02:11, Andrei Alexandrescu via Digitalmars-d
<digitalmars-d@puremagic.com> wrote:
> I've been working on RCStr (endearingly pronounced "Our Sister"),

Ah, I totally skipped over this thread...

Wow... this really doesn't work in any accent I'm close to, but I can
hear it if I imagine you saying it ;)
If I said RCStr, it sounds like 'are'-'see'-strrr, but 'our sister'
would be 'hour'-sistə... isn't it strange that word recognition seems
to work pretty much reliably down a sliding scale until an arbitrary
point where it just drops off. There's not a lot of fuzzy area in the
middle.



Re: Our Sister

2016-05-27 Thread Manu via Digitalmars-d
On 27 May 2016 at 23:32, Andrei Alexandrescu via Digitalmars-d
 wrote:
> On 5/27/16 7:07 AM, Marc Schütz wrote:
>>
>> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>>>
>>> RFC: what primitives should RCStr have?
>>
>>
>> It should _safely_ convert to `const(char)[]`.
>
>
> That is not possible, sorry. -- Andrei

It should safely convert to 'scope const(char)[]', then we only need a
fat-slice or the like at the very top of the callstack...



Re: Our Sister

2016-05-27 Thread Manu via Digitalmars-d
On 28 May 2016 at 10:16, Adam D. Ruppe via Digitalmars-d
 wrote:
> On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:
>>
>> not if [] would be ref-counted too ;-)
>
>
> That would be kinda horrible. Right now, slicing is virtually free and
> compatible with all kinds of backing schemes. If it became refcounted, it'd:
>
> 1) have to keep a pointer to the refcount structure with the slice, adding
> memory cost

This is only true for the owner. If we had 'scope', or something like
it (ie, borrowing in rust lingo), then the fat slice wouldn't need to
be passed around, it's only a burden on the top-level owner.
'scope' is consistently rejected, but it solves so many long-standing
problems we have, and this reduction of 'fat'(/rc)-slices to normal
slices is a particularly important one.
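
A minimal sketch of the borrowing pattern described here, using the `scope` parameter storage class as it exists in current D compilers (checking of `scope` is tightened by the `-preview=dip1000` switch). `countSpaces` and `demo` are hypothetical names; the point is only that the callee takes a plain slice and never learns how the owner manages the memory.

```d
import core.stdc.stdlib : free, malloc;

/// Borrows the characters for the duration of the call only.
size_t countSpaces(scope const(char)[] s) @safe pure nothrow @nogc
{
    size_t n;
    foreach (c; s)
        if (c == ' ') ++n;
    return n;
}

/// Two different owners, one borrowing callee.
size_t demo() @system nothrow @nogc
{
    string literal = "a b c";             // static data, no manual free
    auto raw = cast(char*) malloc(3);     // manually managed block
    if (raw is null) return size_t.max;
    scope (exit) free(raw);
    raw[0 .. 3] = "x y";
    return countSpaces(literal) + countSpaces(raw[0 .. 3]);
}

unittest
{
    assert(countSpaces("a b c") == 2);
    assert(demo() == 3);
}
```

A reference-counted owner would fit the same shape: it keeps the count, and the callee still sees only a pointer and a length.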


Re: Our Sister

2016-05-27 Thread Andrei Alexandrescu via Digitalmars-d

On 05/27/2016 06:09 PM, tsbockman wrote:

On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:

On 05/27/2016 05:02 PM, Era Scarecrow wrote:

  With the current state of things, I'll just take your word on it.


Reasoning is simple - yes we could safely convert to const(char)[] but
that means effectively all refcounting is lost for that string. So we
can convert but in an explicit manner, e.g.
str.toGCThisWillCompletelySuckMan. -- Andrei


But conversions to scope const(char)[] could be made safe, right? (If
scope were ever fully implemented, that is.)


Yah, in principle. -- Andrei



Re: Our Sister

2016-05-27 Thread Adam D. Ruppe via Digitalmars-d

On Friday, 27 May 2016 at 22:09:48 UTC, tsbockman wrote:
But conversions to scope const(char)[] could be made safe, 
right? (If scope were ever fully implemented, that is.)


Indeed, and I really think we should spend more effort on making 
this work. Not as much as Rust spends on it, but a lil more than 
our current return ref dip.


Re: Our Sister

2016-05-27 Thread Adam D. Ruppe via Digitalmars-d

On Friday, 27 May 2016 at 21:51:59 UTC, Seb wrote:

not if [] would be ref-counted too ;-)


That would be kinda horrible. Right now, slicing is virtually 
free and compatible with all kinds of backing schemes. If it 
became refcounted, it'd:


1) have to keep a pointer to the refcount structure with the 
slice, adding memory cost


2) make assignments and slicing work through that refcount 
pointer, adding cpu cost


3) somehow need to know the appropriate freeing strategy, adding 
some kind of indirect call when refcount = 0, and would make 
creating a slice more tedious as you'd need to know this (meaning 
you also probably need to allocate this structure! no more free 
ptr[0 .. length] operation on malloc'd blocks.)



So I'd be pretty strongly against that.
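
The three costs can be made concrete with a rough sketch. `FatSlice` is a hypothetical layout invented for this illustration, not something proposed in the thread, and it omits the destructor that would decrement the count and call `release`.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical layout of a reference-counted slice.
struct FatSlice(T)
{
    T* ptr;
    size_t length;
    size_t* refCount;                            // cost 1: extra pointer
    void function(void*) @nogc nothrow release;  // cost 3: freeing strategy

    this(this) { if (refCount) ++*refCount; }    // cost 2: work on every copy
}

// A built-in slice is two machine words; the sketch above is four.
static assert((char[]).sizeof == 2 * size_t.sizeof);
static assert(FatSlice!char.sizeof == 4 * size_t.sizeof);

/// Today's slicing of a malloc'd block: no bookkeeping at all.
size_t sliceMalloced()
{
    auto p = cast(char*) malloc(16);
    if (p is null) return 0;
    scope (exit) free(p);
    char[] s = p[0 .. 16];      // just pointer + length
    return s[4 .. 12].length;   // sub-slicing is equally cheap
}

unittest
{
    assert(sliceMalloced() == 8);
}
```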


Re: Our Sister

2016-05-27 Thread tsbockman via Digitalmars-d

On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:

On 05/27/2016 05:02 PM, Era Scarecrow wrote:
  With the current state of things, I'll just take your word 
on it.


Reasoning is simple - yes we could safely convert to 
const(char)[] but that means effectively all refcounting is 
lost for that string. So we can convert but in an explicit 
manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei


But conversions to scope const(char)[] could be made safe, right? 
(If scope were ever fully implemented, that is.)


Re: Our Sister

2016-05-27 Thread Seb via Digitalmars-d

On Friday, 27 May 2016 at 21:25:50 UTC, Andrei Alexandrescu wrote:

On 05/27/2016 05:02 PM, Era Scarecrow wrote:
  With the current state of things, I'll just take your word 
on it.


Reasoning is simple - yes we could safely convert to 
const(char)[] but that means effectively all refcounting is 
lost for that string. So we can convert but in an explicit 
manner, e.g. str.toGCThisWillCompletelySuckMan. -- Andrei


not if [] would be ref-counted too ;-)


Re: Our Sister

2016-05-27 Thread Andrei Alexandrescu via Digitalmars-d

On 05/27/2016 05:02 PM, Era Scarecrow wrote:

  With the current state of things, I'll just take your word on it.


Reasoning is simple - yes we could safely convert to const(char)[] but 
that means effectively all refcounting is lost for that string. So we 
can convert but in an explicit manner, e.g. 
str.toGCThisWillCompletelySuckMan. -- Andrei


Re: Our Sister

2016-05-27 Thread Era Scarecrow via Digitalmars-d

On Friday, 27 May 2016 at 13:32:30 UTC, Andrei Alexandrescu wrote:

On 5/27/16 7:07 AM, Marc Schütz wrote:
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:

RFC: what primitives should RCStr have?


It should _safely_ convert to `const(char)[]`.


That is not possible, sorry. -- Andrei


 I wonder if it could...

 For a while now I've wondered why there isn't an option to 
include flags to every type (for debugging)? The flags could 
relay a lot of information, like if a variable was originally 
immutable, const, shared, other? If it was originally allocated 
using the GC, malloc, C/C++/Other or stack. If it used a 
constructor, init, or not at all (= void)? Along with control 
options like where/when an assignment tries to happen, copies 
its state (or its variables with indirection), or printing an 
output each time it changes, etc.


 With the current state of things, I'll just take your word on it.


Re: Our Sister

2016-05-27 Thread Andrei Alexandrescu via Digitalmars-d

On 5/27/16 7:07 AM, Marc Schütz wrote:

On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:

RFC: what primitives should RCStr have?


It should _safely_ convert to `const(char)[]`.


That is not possible, sorry. -- Andrei


Re: Our Sister

2016-05-27 Thread Marc Schütz via Digitalmars-d
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:

RFC: what primitives should RCStr have?


It should _safely_ convert to `const(char)[]`.


Re: Our Sister

2016-05-27 Thread Nordlöw via Digitalmars-d

On Thursday, 26 May 2016 at 17:32:33 UTC, Jack Stouffer wrote:
*bikeshedding*: How about RCString, because the convention for 
D names is to be explicit most of the time.


+1


Re: Our Sister

2016-05-27 Thread Nordlöw via Digitalmars-d
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:
* Fast: use the small string optimization and various other 
layout and algorithms to make it a good choice for high 
performance strings


For inspiration see:

- Vladimir recommends `tempCString`
- Nikolay has https://bitbucket.org/sibnick/inplacearray.git

Original thread:

https://forum.dlang.org/post/msrlumbobhpuljvhw...@forum.dlang.org


Re: Our Sister

2016-05-26 Thread H. S. Teoh via Digitalmars-d
On Thu, May 26, 2016 at 04:24:10PM -0400, Andrei Alexandrescu via Digitalmars-d 
wrote:
> On 05/26/2016 02:44 PM, Seb wrote:
> > If you want RCStr to be adopted it has to be a drop-in replacement
> > for string.
> 
> With all the criticism leveled against string, I thought more of the
> opposite. This is an opportunity to get it right. -- Andrei

I'm not sure what criticism you're referring to. The only one I can
think of is autodecoding, which isn't really an inherent part of string
being immutable(char)[], which I think is a fine idea.


T

-- 
The most powerful one-line C program: #include "/dev/tty" -- IOCCC


Re: Our Sister

2016-05-26 Thread Seb via Digitalmars-d

On Thursday, 26 May 2016 at 21:42:31 UTC, jmh530 wrote:
On Thursday, 26 May 2016 at 20:24:10 UTC, Andrei Alexandrescu 
wrote:

On 05/26/2016 02:44 PM, Seb wrote:
If you want RCStr to be adopted it has to be a drop-in 
replacement for

string.


With all the criticism leveled against string, I thought more 
of the opposite. This is an opportunity to get it right. -- 
Andrei


Hmm, I think it would be better to be right than necessarily a 
drop-in. I think the idea is so that you could change

alias string = immutable(char)[];
to something using RCString and there would be minimal 
breakages.


Oh yes that's what I meant. Sorry for being so confusing.
__Right__ is way more important than breakages. For that we have 
`dfix`.


Re: Our Sister

2016-05-26 Thread jmh530 via Digitalmars-d
On Thursday, 26 May 2016 at 20:24:10 UTC, Andrei Alexandrescu 
wrote:

On 05/26/2016 02:44 PM, Seb wrote:
If you want RCStr to be adopted it has to be a drop-in 
replacement for

string.


With all the criticism leveled against string, I thought more 
of the opposite. This is an opportunity to get it right. -- 
Andrei


Hmm, I think it would be better to be right than necessarily a 
drop-in. I think the idea is so that you could change

alias string = immutable(char)[];
to something using RCString and there would be minimal breakages.


Re: Our Sister

2016-05-26 Thread Andrei Alexandrescu via Digitalmars-d

On 05/26/2016 04:32 PM, Bastiaan Veelo wrote:

* Would it support implicit sharing (copy-on-write)? What about
sub-strings?


Yes, COW. Substrings will be managed COW-ish as well (no copy upon 
substring extraction).



* Will concatenations be fast?


No, it will copy (i.e. no multiple segments management). It will be of 
course optimized as much as we can.



* Would this have value for compile time string operations, mixins, etc.?


Not planned.


RFC: what primitives should RCStr have?


QString may have a few that are worth supporting:
http://doc.qt.io/qt-5/qstring.html


Good list. Thanks!


Andrei



Re: Our Sister

2016-05-26 Thread Bastiaan Veelo via Digitalmars-d
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu 
wrote:
I've been working on RCStr (endearingly pronounced "Our 
Sister"), D's up-and-coming reference counted string type. The 
goals are:


* Reference counted, shouldn't leak if all instances destroyed; 
even if not, use the GC as a last-resort reclamation mechanism.


* Entirely @safe.

* Support UTF 100% by means of RCStr!char, RCStr!wchar etc. but 
also raw manipulation and custom encodings via RCStr!ubyte, 
RCStr!ushort etc.


* Support several views of the same string, e.g. given s of 
type RCStr!char, it can be iterated byte-wise, code point-wise, 
code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar 
etc.


* Support const and immutable qualifiers for the character type.

* Work well with const and immutable when they qualify the 
entire RCStr type.


* Fast: use the small string optimization and various other 
layout and algorithms to make it a good choice for high 
performance strings
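
The "several views" goal has a rough counterpart in today's Phobos for built-in strings, sketched below. The `s.by!T` syntax from the post is not used here, since its eventual shape is not given; `viewLengths` is a hypothetical helper, while `representation`, `byCodeUnit`, `byDchar` and `walkLength` are existing library functions.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byCodeUnit, byDchar;

/// Counts the elements seen through three different views of one string.
size_t[3] viewLengths(string s)
{
    return [
        s.representation.length,   // raw bytes, immutable(ubyte)[]
        s.byCodeUnit.walkLength,   // UTF-8 code units, no decoding
        s.byDchar.walkLength,      // decoded code points
    ];
}

unittest
{
    // "h\u00E9llo" holds one two-byte character: 6 code units, 5 code points.
    auto v = viewLengths("h\u00E9llo");
    assert(v[0] == 6 && v[1] == 6 && v[2] == 5);
}
```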



Interesting! A few noob questions first:

* Would it support implicit sharing (copy-on-write)? What about 
sub-strings?


* Will concatenations be fast?

* Would this have value for compile time string operations, 
mixins, etc.?




RFC: what primitives should RCStr have?


QString may have a few that are worth supporting: 
http://doc.qt.io/qt-5/qstring.html


Bastiaan.


Re: Our Sister

2016-05-26 Thread Andrei Alexandrescu via Digitalmars-d

On 05/26/2016 02:44 PM, Seb wrote:

If you want RCStr to be adopted it has to be a drop-in replacement for
string.


With all the criticism leveled against string, I thought more of the 
opposite. This is an opportunity to get it right. -- Andrei


Re: Our Sister [i.e. RCStr/RCString]

2016-05-26 Thread Jonathan M Davis via Digitalmars-d
On Thursday, May 26, 2016 16:20:37 Adam D. Ruppe via Digitalmars-d wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu
>
> wrote:
> > I've been working on RCStr (endearingly pronounced "Our Sister")
>
> You really should actually mention RCStr in the subject line so
> people overwhelmed with the staggering amount of off topic
> chatter on this forum don't disregard this thread too.

Yeah. I was about to ignore this thread as being clearly OT until I saw that
it was started by Andrei.

- Jonathan M Davis



Re: Our Sister

2016-05-26 Thread Jonathan M Davis via Digitalmars-d
On Thursday, May 26, 2016 17:50:36 Adam D. Ruppe via Digitalmars-d wrote:
> Would an RCStr pass isSomeString? I kinda think it shouldn't.
> Actually, isSomeString probably shouldn't often be used - instead
> checking for string-like range capabilities is likely better for
> algorithms. Then doing some_algorithm(my_rcstr) fails - you must
> do some_algorithm(my_rcstr.some_range)

RCStr definitely should _not_ pass isSomeString. Those traits specifically
work only for the built-in types and not for stuff that acts like them. It's
a disaster waiting to happen otherwise. We need to distinguish between
testing for something that is a string and something that acts like one.
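
A short sketch of that distinction, assuming a hypothetical wrapper type. `StrLike`, `byChar` and `countChar` are invented for the example; `isSomeString`, `isSomeChar`, `isInputRange`, `ElementType` and `byCodeUnit` are the existing Phobos symbols.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar, isSomeString;
import std.utf : byCodeUnit;

/// Hypothetical string-like wrapper; it is not a built-in string.
struct StrLike
{
    private string data;
    auto byChar() { return data.byCodeUnit; }
}

// The trait answers "is this a built-in string type?", nothing more.
static assert( isSomeString!string);
static assert(!isSomeString!StrLike);

/// Constrained on capabilities: any input range of characters will do.
size_t countChar(R)(R r, char needle)
if (isInputRange!R && isSomeChar!(ElementType!R))
{
    size_t n;
    foreach (c; r)
        if (c == needle) ++n;
    return n;
}

unittest
{
    assert(countChar("banana", 'a') == 3);                  // built-in string
    assert(countChar(StrLike("banana").byChar, 'a') == 3);  // explicit range
}
```

Passing the wrapper itself to `countChar` would not compile, which matches the `some_algorithm(my_rcstr.some_range)` style suggested in the quoted post.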

- Jonathan M Davis



Re: Our Sister

2016-05-26 Thread jmh530 via Digitalmars-d

On Thursday, 26 May 2016 at 18:44:42 UTC, Seb wrote:


Great news!
I think one can't stress this enough: If you want RCStr to be 
adopted it has to be a drop-in replacement for string.


Maybe we can bundle the transition from auto-decoding with the 
adoption of RCString. There was the proposal of having String 
without auto-decoding for this migration.


I like these ideas (and RCString over RCStr).


Re: Our Sister

2016-05-26 Thread Seb via Digitalmars-d

On Thursday, 26 May 2016 at 17:45:15 UTC, Xinok wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our
>> Sister"), D's up-and-coming reference counted string type. The
>> goals are:
>>
>> ...
>
> I don't know how practical this would be, but if at all
> feasible, I think one of the goals should be to have a common
> interface/primitives with regular strings so we can write
> generic functions which accept both native strings and RCStr.


Great news!
I think one can't stress this enough: If you want RCStr to be 
adopted it has to be a drop-in replacement for string.


Maybe we can bundle the transition from auto-decoding with the 
adoption of an RCString. There was the proposal of having String 
without auto-decoding for this migration.


Re: Our Sister

2016-05-26 Thread Joakim via Digitalmars-d

On Thursday, 26 May 2016 at 16:20:37 UTC, Adam D. Ruppe wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our
>> Sister")
>
> You really should actually mention RCStr in the subject line so
> people overwhelmed with the staggering amount of off topic
> chatter on this forum don't disregard this thread too.


Where do you see all this "chatter?"  Looking at the topics for 
the last 10 days, I only see one not about D generally, and it's 
labeled OT.


Re: Our Sister

2016-05-26 Thread Jack Stouffer via Digitalmars-d

On Thursday, 26 May 2016 at 17:50:36 UTC, Adam D. Ruppe wrote:
> That would be templated so like byUTF!char and byUTF!wchar
> right?
>
> Then byCodePoint can just be another name for byUTF!dchar. I
> kinda like that.
>
> Ideally, the string type would also use lazy imports for any
> conversion table. So if you never call byGrapheme, it never
> imports the std.uni tables. (Heck, std.uni could be the one to
> provide that type, of course.)


This has the added benefit that it would automatically work with 
a lot of generic code that uses those functions.



> Would an RCStr pass isSomeString? I kinda think it shouldn't.


I agree, it shouldn't. isSomeString should only test for one of 
the language provided string types.




Re: Our Sister

2016-05-26 Thread Adam D. Ruppe via Digitalmars-d

On Thursday, 26 May 2016 at 17:32:33 UTC, Jack Stouffer wrote:
> Well, because we already have the standard library functions
> representation, byUTF



That would be templated so like byUTF!char and byUTF!wchar right?

Then byCodePoint can just be another name for byUTF!dchar. I 
kinda like that.


Ideally, the string type would also use lazy imports for any 
conversion table. So if you never call byGrapheme, it never 
imports the std.uni tables. (Heck, std.uni could be the one to 
provide that type, of course.)



Would an RCStr pass isSomeString? I kinda think it shouldn't. 
Actually, isSomeString probably shouldn't often be used - instead 
checking for string-like range capabilities is likely better for 
algorithms. Then doing some_algorithm(my_rcstr) fails - you must 
do some_algorithm(my_rcstr.some_range)
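
A rough sketch of that idea, assuming current Phobos range traits. `countA`, `MyStr` and `someRange` are hypothetical names made up for illustration; `MyStr` only stands in for a string-like type and does no reference counting. The algorithm is constrained on range capabilities instead of `isSomeString`, so the wrapper itself is rejected while an explicitly chosen view is accepted.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar;
import std.utf : byCodeUnit;

// Hypothetical algorithm: counts the elements equal to 'a' in any
// input range whose elements are characters.
size_t countA(R)(R r)
if (isInputRange!R && isSomeChar!(ElementType!R))
{
    size_t n;
    foreach (c; r)
        if (c == 'a')
            ++n;
    return n;
}

// Hypothetical stand-in for a library string type; not a range itself.
struct MyStr
{
    private string payload;

    // The caller has to pick a view explicitly.
    auto someRange() { return payload.byCodeUnit; }
}

void main()
{
    auto s = MyStr("banana");
    static assert(!isInputRange!MyStr);                 // the type alone is not a range
    static assert(!__traits(compiles, countA(s)));      // so the constraint rejects it
    assert(countA(s.someRange) == 3);                   // an explicit view works
    assert(countA("banana") == 3);                      // built-in strings still work
}
```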


Re: Our Sister

2016-05-26 Thread Andrei Alexandrescu via Digitalmars-d

On 05/26/2016 12:58 PM, Gary Willoughby wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> * Support several views of the same string, e.g. given s of type
>> RCStr!char, it can be iterated byte-wise, code point-wise, code
>> unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.
>
> Will s.by!Grapheme be supported too?


Yes. -- Andrei


Re: Our Sister

2016-05-26 Thread Jack Stouffer via Digitalmars-d
On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
> * Support const and immutable qualifiers for the character type.


How is that going, BTW? Last I heard you were having problems with 
inout/const.


> * Support several views of the same string, e.g. given s of
> type RCStr!char, it can be iterated byte-wise, code point-wise,
> code unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar
> etc.

[snip]

> RFC: what primitives should RCStr have?


Well, because we already have the standard library functions 
representation, byUTF, byCodePoint, byCodeUnit, and byGrapheme, I 
think RCStr should provide these names as methods which all 
return ranges. If possible, these would all work regardless of 
character or integer type of the data. So in effect, RCStr would 
have completely encapsulated data. Let's not make the same 
mistake that we made with string et al. by providing a default.
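
For reference, a short sketch of how the existing Phobos functions named above already give different views of one built-in string. The sample text is an assumption chosen so that the counts differ: it contains 'e' followed by a combining diaeresis (U+0308).

```d
import std.range : walkLength;
import std.string : representation;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byUTF;

void main()
{
    // 'e' followed by U+0308 (combining diaeresis) forms one grapheme.
    string s = "noe\u0308l";

    assert(s.representation.length == 6);     // raw UTF-8 bytes
    assert(s.byCodeUnit.walkLength == 6);     // UTF-8 code units
    assert(s.byUTF!wchar.walkLength == 5);    // UTF-16 code units
    assert(s.byUTF!dchar.walkLength == 5);    // code points
    assert(s.byGrapheme.walkLength == 4);     // user-perceived characters
}
```

Each call returns a range (or, for `representation`, an array of bytes), so none of them needs a default interpretation of the data.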


If at all possible, it would be great if it was also an output 
range.



> RCStr


*bikeshedding*: How about RCString, because the convention for D 
names is to be explicit most of the time.


Re: Our Sister

2016-05-26 Thread ixid via Digitalmars-d

On Thursday, 26 May 2016 at 16:20:37 UTC, Adam D. Ruppe wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu wrote:
>> I've been working on RCStr (endearingly pronounced "Our
>> Sister")
>
> You really should actually mention RCStr in the subject line so
> people overwhelmed with the staggering amount of off topic
> chatter on this forum don't disregard this thread too.


To be fair, using a forum called 'General' for technical 
discussion is asking for trouble. We will be able to tell when D 
actually starts to become popular because this part of the forum 
will cease to function as it's inundated with newbies who expect 
it to mean general questions or something similar.


Our Sister

2016-05-26 Thread Andrei Alexandrescu via Digitalmars-d
I've been working on RCStr (endearingly pronounced "Our Sister"), D's 
up-and-coming reference counted string type. The goals are:


* Reference counted, shouldn't leak if all instances destroyed; even if 
not, use the GC as a last-resort reclamation mechanism.


* Entirely @safe.

* Support UTF 100% by means of RCStr!char, RCStr!wchar etc. but also raw 
manipulation and custom encodings via RCStr!ubyte, RCStr!ushort etc.


* Support several views of the same string, e.g. given s of type 
RCStr!char, it can be iterated byte-wise, code point-wise, code 
unit-wise etc. by using s.by!ubyte, s.by!char, s.by!dchar etc.


* Support const and immutable qualifiers for the character type.

* Work well with const and immutable when they qualify the entire RCStr 
type.


* Fast: use the small string optimization and various other layout and 
algorithmic techniques to make it a good choice for high-performance 
strings.
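

To make the `s.by!T` syntax from the list above concrete, here is a rough sketch of how such a view selector could be spelled. Everything in it is an assumption for illustration: `ViewDemo` is a hypothetical stand-in, it does no reference counting or small string optimization, and it simply forwards to existing Phobos functions. It is not the actual RCStr design.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byCodeUnit, byUTF;

// Hypothetical stand-in for RCStr!char: only the view selection is shown.
struct ViewDemo
{
    private string payload;

    // by!ubyte -> raw bytes, by!char -> code units, by!dchar -> code points.
    auto by(T)()
    {
        static if (is(T == ubyte))
            return payload.representation;
        else static if (is(T == char))
            return payload.byCodeUnit;
        else static if (is(T == dchar))
            return payload.byUTF!dchar;
        else
            static assert(0, "unsupported view: " ~ T.stringof);
    }
}

void main()
{
    auto s = ViewDemo("hell\u00F6");          // the last character takes two UTF-8 bytes
    assert(s.by!ubyte.length == 6);           // byte-wise
    assert(s.by!char.walkLength == 6);        // code unit-wise
    assert(s.by!dchar.walkLength == 5);       // code point-wise
}
```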


RFC: what primitives should RCStr have?


Thanks,

Andrei