date:20140321

Re: Troubles with taskPool.amap, lambdas and more

2014-03-21 Thread Vladimir Panteleev


On Wednesday, 19 March 2014 at 00:13:10 UTC, bearophile wrote:
> A problem (that is not a regression) is that taskPool.amap 
> doesn't seem able to accept a lambda for some reason.


The reason for that is that any function in D currently can have 
at most one context pointer. For class and struct methods, that 
is the this pointer. For free functions which take a lambda, 
such as map, reduce and other std.algorithm functions, it is 
the context of the lambda (a pointer to the containing function's 
stack frame or whatnot).


You can't have both.

I think this is a glaring design problem in std.parallelism.
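
A minimal sketch of the usual workaround, assuming the mapped function can be written at module level: such a function carries no context pointer, so it can be passed to taskPool.amap as a template alias argument. The names square and squares are hypothetical and are not taken from the program discussed in this thread.

```d
import std.parallelism : taskPool;

// Module-level function: no context pointer is needed to call it.
int square(int x)
{
    return x * x;
}

// Maps square over the input in parallel and returns a new array.
int[] squares(int[] values)
{
    return taskPool.amap!square(values);
}

void main()
{
    import std.stdio : writeln;
    writeln(squares([1, 2, 3, 4]));
}
```

Anything the function needs beyond its argument has to be passed in explicitly, for example by mapping over a range of tuples.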

> But even using a normal static inner function, the program 
> asserts most times at run-time (but not always), while few 
> months ago it used to work reliably. So perhaps in this messy 
> situation there's some material for bug reports. Opinions and 
> suggestions are welcome.


Can you perform a regression test, or post the asserting program?

Re: Function to print a diamond shape

2014-03-21 Thread Jay Norwood


On Friday, 21 March 2014 at 00:31:58 UTC, bearophile wrote:

This is a somewhat common little exercise: Write a function


Bye,
bearophile


I like that replicate, but it is easier for me to keep track of the 
counts if I work from the center.


// fragment: assumes an odd n, plus std.stdio and std.array (replicate)
int[] blanks;
blanks.length = n;
int[] stars;
stars.length = n;

int c = n/2; // center of diamond
int cp1 = c+1;
blanks[c]=0;
stars[c]=n;

// calculate stars and blanks in each row
for(int i=1; i<cp1; i++){
    blanks[c-i] = blanks[c+i] = i;
    stars[c-i] = stars[c+i] = n - (i*2);
}

for (int i=0; i<n; i++){
    write(" ".replicate(blanks[i]));
    writeln("*".replicate(stars[i]));
}

Re: GC allocation issue

2014-03-21 Thread monarch_dodra


On Friday, 21 March 2014 at 00:56:22 UTC, Etienne wrote:
> I'm trying to store a copy of strings for long-running 
> processes with malloc. I tried using emplace but the copy gets 
> deleted by the GC. Any idea why?


Could you show the snippet where you used emplace? I'd like to 
know how you are using it. In particular, where you are 
emplacing, and *what*: the slice, or the slice contents?
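
To illustrate the distinction between the slice and the slice contents, here is a minimal sketch that copies the characters themselves into malloc'd memory. It assumes that only the character data needs to outlive the GC. mallocCopy is a hypothetical helper, not code from the poster's project.

```d
import core.exception : onOutOfMemoryError;
import core.stdc.stdlib : free, malloc;

// Copies the *contents* of src into malloc'd memory and returns a slice
// over that memory. The caller must release it with free(result.ptr).
char[] mallocCopy(const(char)[] src)
{
    if (src.length == 0)
        return null;
    auto p = cast(char*) malloc(src.length);
    if (p is null)
        onOutOfMemoryError();
    p[0 .. src.length] = src[]; // copy the characters, not the slice
    return p[0 .. src.length];
}

void main()
{
    auto copy = mallocCopy("hello");
    assert(copy == "hello");
    free(copy.ptr);
}
```

Storing only the slice (pointer plus length) in malloc'd memory would leave the characters in GC memory, referenced from a place the GC does not scan unless the range is registered.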

Re: Template with template?

2014-03-21 Thread Chris


On Thursday, 20 March 2014 at 20:20:28 UTC, John Colvin wrote:

On Thursday, 20 March 2014 at 19:38:25 UTC, Chris wrote:

On Thursday, 20 March 2014 at 18:54:30 UTC, John Colvin wrote:

On Thursday, 20 March 2014 at 18:39:32 UTC, Chris wrote:
On Thursday, 20 March 2014 at 17:49:52 UTC, John Colvin 
wrote:

On Thursday, 20 March 2014 at 16:40:50 UTC, Chris wrote:
On Thursday, 20 March 2014 at 16:32:34 UTC, Vladimir 
Panteleev wrote:

On Thursday, 20 March 2014 at 16:28:46 UTC, Chris wrote:
How can I instantiate Person with Trait, i.e. a template 
with a template?


import std.stdio;

struct Trait(T0, T1) {
    T0 name;
    T1 value;
    T1[T0] map;

    this(T0 name, T1 value) {
        this.name = name;
        this.value = value;
        map[name] = value;
    }
}

class Person(T) {
    T[] traits;

    void addTrait(T trait) {
        traits ~= trait;
    }
}


void main()
{
    auto trait1 = Trait!(string, string)("Name", "John");
    auto trait2 = Trait!(string, int)("Age", 42);
    writefln("%s", trait1.map);
    writefln("%s", trait2.map);
    // above code compiles and works
}


Person!(Trait!(string, string)) person;

-- or --

alias MyTrait = Trait!(string, string);
Person!MyTrait person;

Note that this approach won't let you have traits with 
different parameters within the same Person type.


Yep, I've already tried this (sorry I should've mentioned 
it!). But I don't want this restriction.


Arrays are homogeneous. All the elements must be of the 
same type. Different instantiations of templates are 
different types.


You could use an array of std.variant.Variant


The elements are all of type Trait. However, Type itself 
might be of different types. That's why it is not possible? I've 
come across this restriction before when using templates, which 
is a big disappointment because it restricts the templatization / 
generalization of data structures somewhat.


Trait is not a type. Trait is a template. An instantiation of 
the Trait template is a type.



Arrays are contiguous, homogeneous data. This is fundamental 
to their design and their performance characteristics.
Workarounds use at least one of the following: indirection, 
tagging* and padding. Variant uses tagging and padding. 
Interface/base-class arrays use indirection (and tagging, 
ultimately).


*inline or external, or even compile-time.

This is true in *every* programming language, just with 
different names.


I thought the array T[] traits could hold any _type_ the 
template Trait is instantiated into. That's where I got it 
wrong. I understand the practical aspects of this restriction 
(homogeneous data, performance, the workarounds involved, 
etc.). However, this makes templates less universal and rather 
cumbersome to work with in certain circumstances. Take for 
example the Person class. If I want to do numerical operations 
with the age of the person, I will have to convert the age to 
an int (or whatever) later on instead of just doing it once at 
the beginning (when loading data). So every time I access 
Trait.map["age"] I will have to convert it to a number before 
I can calculate anything. Either that, or I store it in a field of 
its own when instantiating Trait. Whatever workaround I choose, 
it will make the code less elegant and less simple.


Maybe I expect(ed) too much of templates. Mea culpa.


Try this:

import std.stdio;
import std.variant;

enum maxTraitSize = 64;

struct Trait(T0, T1)
{
    T0 name;
    T1 value;
    T1[T0] map;

    static assert(T0.sizeof + T1.sizeof + (T1[T0]).sizeof <= maxTraitSize);

    this(T0 name, T1 value)
    {
        this.name = name;
        this.value = value;
        map[name] = value;
    }
}

class Person
{
    alias ElT = VariantN!maxTraitSize;
    ElT[] traits;

    void addTrait(T)(T trait)
        if(is(T == Trait!Q, Q...))
    {
        traits ~= ElT(trait);
    }
}

void main()
{
    auto trait1 = Trait!(string, string)("Name", "John");
    auto trait2 = Trait!(string, int)("Age", 42);
    writefln("%s", trait1.map);
    writefln("%s", trait2.map);

    auto p = new Person;
    p.addTrait(trait1);
    p.addTrait(trait2);
    writeln(p.traits);
}


Thanks John, this does what I had in mind. I don't know, though, 
if I will use it for a real world application. I would have to 
test the behavior thoroughly first. The background is that I will 
have variable user input, i.e. users (non-programmers) define 
traits and rules. There is no way to foresee what names users 
will choose. That's why Trait stores arbitrary keys and values 
(and Meta's solution is not an option). I only thought it would 
be nice to have [string:int] straight away for numerical 
operations further down the road. I was also thinking about using 
std.variant. But I'm just starting out with this ...
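
For comparison with the Variant-based version, the indirection workaround mentioned earlier in the thread (an array of interface references) might look like the sketch below. It assumes Trait can be a class rather than a struct. AnyTrait and describe are hypothetical names introduced only for illustration.

```d
import std.conv : text;

// Common interface, so that differently instantiated traits can share
// one array (indirection: the array stores references).
interface AnyTrait
{
    string describe();
}

class Trait(K, V) : AnyTrait
{
    K name;
    V value;

    this(K name, V value)
    {
        this.name = name;
        this.value = value;
    }

    string describe()
    {
        return text(name, " = ", value);
    }
}

class Person
{
    AnyTrait[] traits;

    void addTrait(AnyTrait trait)
    {
        traits ~= trait;
    }
}

void main()
{
    auto p = new Person;
    p.addTrait(new Trait!(string, string)("Name", "John"));
    p.addTrait(new Trait!(string, int)("Age", 42));

    // A dynamic cast recovers the concrete type when numbers are needed.
    auto age = cast(Trait!(string, int)) p.traits[1];
    assert(age !is null && age.value + 1 == 43);
}
```

The cast returns null when the element is a different instantiation, which plays the role of the type tag that Variant stores inline.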

Re: Template with template?

2014-03-21 Thread Chris


On Friday, 21 March 2014 at 09:54:24 UTC, Chris wrote:

[...]

Btw, I was initially inspired by Objective-C's NSSet that can 
hold arbitrary objects.

Re: Function to print a diamond shape

2014-03-21 Thread Sergei Nosov


On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
> This is a somewhat common little exercise: Write a function 
> that takes the size of a diamond and produces a diamond of that 
> size.
>
> When printed, here is the output for size 11:
>
>      *
>     ***
>    *****
>   *******
>  *********
> ***********
>  *********
>   *******
>    *****
>     ***
>      *
>
> What interesting, boring, efficient, slow, etc. ways are there?
>
> Ali


Probably, the most boring way is

foreach(i; 0..N)
{
    foreach(j; 0..N)
        write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= N/2
                   && j - i <= N/2]);

    writeln;
}

Re: Function to print a diamond shape

2014-03-21 Thread Jay Norwood

This one calculates, then outputs subranges of the ba and sa char 
arrays.


int n = 11;
int[] blanks;
blanks.length = n;
int[] stars;
stars.length = n;
char[] ba;
ba.length = n;
ba[] = ' '; // fill full ba array
char[] sa;
sa.length = n;
sa[] = '*'; // fill full sa array

int c = n/2; // center of diamond
int cp1 = c+1;
blanks[c]=0;
stars[c]=n;

// calculate stars and blanks in each row
for(int i=1; i<cp1; i++){
    blanks[c-i] = blanks[c+i] = i;
    stars[c-i] = stars[c+i] = n - (i*2);
}

// output subranges of the ba and sa char arrays
for (int i=0; i<n; i++){
    write(ba[$-blanks[i]..$]);
    writeln(sa[$-stars[i]..$]);
}

Re: Function to print a diamond shape

2014-03-21 Thread Andrea Fontana


On Friday, 21 March 2014 at 12:32:58 UTC, Sergei Nosov wrote:

> Probably, the most boring way is
>
> foreach(i; 0..N)
> {
>     foreach(j; 0..N)
>         write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= N/2
>                    && j - i <= N/2]);
>
>     writeln;
> }


A single foreach(i; 0..N*N) is more boring!
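
A sketch of what that single-loop variant could look like, assuming an odd size and using the abs-based test for membership in the diamond. The function name diamond and the choice to return a string instead of printing directly are illustrative.

```d
import std.array : appender;
import std.math : abs;
import std.stdio : write;

// Builds an n-by-n diamond with one loop over all n*n cells.
string diamond(int n)
{
    auto buf = appender!string();
    foreach (k; 0 .. n * n)
    {
        immutable i = k / n; // row
        immutable j = k % n; // column
        // A cell is inside the diamond when its Manhattan distance
        // from the center is at most n/2.
        buf.put(abs(i - n / 2) + abs(j - n / 2) <= n / 2 ? '*' : ' ');
        if (j == n - 1)
            buf.put('\n');
    }
    return buf.data;
}

void main()
{
    write(diamond(11));
}
```

Each row is padded with trailing blanks to the full width n.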

Re: GC allocation issue

2014-03-21 Thread Etienne


On 2014-03-21 2:53 AM, monarch_dodra wrote:

> On Friday, 21 March 2014 at 00:56:22 UTC, Etienne wrote:
>> I'm trying to store a copy of strings for long-running processes with
>> malloc. I tried using emplace but the copy gets deleted by the GC. Any
>> idea why?
>
> Could you show the snippet where you used emplace? I'd like to know
> how you are using it. In particular, where you are emplacing, and
> *what*: the slice, or the slice contents?


https://github.com/globecsys/cache.d/blob/master/chd/table.d#L1089
This line does the copying

I don't think it's the memory copying algorithm anymore however. The GC 
crashes altogether during fullcollect(), the logs give me this:


cache-d_d.exe!gc@gc@Gcx@mark(void * this, void * nRecurse, int ptop) Line 2266  C++
cache-d_d.exe!gc@gc@Gcx@mark(void * this, void * ptop) Line 2249  C++
cache-d_d.exe!gc@gc@Gcx@fullcollect() Line 2454  C++
cache-d_d.exe!gc@gc@GC@mallocNoSync(unsigned int this, unsigned int alloc_size, unsigned int * alloc_size) Line 458  C++
cache-d_d.exe!gc@gc@GC@malloc(unsigned int this, unsigned int alloc_size, unsigned int * bits) Line 413  C++

...


With ptop= 03D8F030, pbot= 03E4F030

They both point to invalid memory. It looks like a really wide range too; 
the usual would be 037CCB80 - 037CCBA0 or such. I don't know how to 
find out where they come from... Maybe I could do an assert on that 
specific value in druntime.

Re: Function to print a diamond shape

2014-03-21 Thread Vladimir Panteleev


On Friday, 21 March 2014 at 12:32:58 UTC, Sergei Nosov wrote:

> On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
>> This is a somewhat common little exercise: Write a function 
>> that takes the size of a diamond and produces a diamond of 
>> that size.
>>
>> When printed, here is the output for size 11:
>>
>>      *
>>     ***
>>    *****
>>   *******
>>  *********
>> ***********
>>  *********
>>   *******
>>    *****
>>     ***
>>      *
>>
>> What interesting, boring, efficient, slow, etc. ways are there?
>>
>> Ali
>
> Probably, the most boring way is
>
> foreach(i; 0..N)
> {
>     foreach(j; 0..N)
>         write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= N/2
>                    && j - i <= N/2]);

write(" *"[abs(i-N/2) + abs(j-N/2) <= N/2]);

>     writeln;
> }

Re: GC allocation issue

2014-03-21 Thread Etienne


On 2014-03-21 9:36 AM, Etienne wrote:

> With ptop= 03D8F030, pbot= 03E4F030
>
> They both point invalid memory. It looks like a really wide range too,
> the usual would be 037CCB80 - 037CCBA0 or such. I don't know how to
> find out where they come from... Maybe I could do an assert on that
> specific value in druntime


Looks like the range of the string[] keys array, it gets pretty big 
after adding 1s of strings.


+GC.addRange(p = 03EA0AB0, sz = 0x38), p + sz = 03EA0AE8
set: 209499732595 = ¨98303126
+GC.addRange(p = 03EA0B40, sz = 0x38), p + sz = 03EA0B78
set: 6491851329 = ¨50107378
+GC.addRange(p = 03EA0BD0, sz = 0x38), p + sz = 03EA0C08
set: 262797465895 = ¨14438090
+GC.addRange(p = 03EA0C60, sz = 0x38), p + sz = 03EA0C98
set: 95992076217 = ¨65000864
+GC.addRange(p = 03EA0CF0, sz = 0x38), p + sz = 03EA0D28
+GC.addRange(p = 03EA0D50, sz = 0x30000), p + sz = 03ED0D50

It crashes when sz approaches 0x18, it looks like (my best guess) 
the resized array doesn't get allocated but the GC still tries to scan it.

Re: Troubles with taskPool.amap, lambdas and more

2014-03-21 Thread bearophile


Vladimir Panteleev:


> I think this is a glaring design problem in std.parallelism.

Do you want to report the issue in Bugzilla?

> Can you perform a regression test, or post the asserting 
> program?


It's the first program here (in the meantime I have updated it):
http://rosettacode.org/wiki/Parallel_calculations#D

Bye,
bearophile

Re: Function to print a diamond shape

2014-03-21 Thread Sergei Nosov

On Friday, 21 March 2014 at 13:59:27 UTC, Vladimir Panteleev 
wrote:

> On Friday, 21 March 2014 at 12:32:58 UTC, Sergei Nosov wrote:
>> On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
>>> This is a somewhat common little exercise: Write a function 
>>> that takes the size of a diamond and produces a diamond of 
>>> that size.
>>>
>>> When printed, here is the output for size 11:
>>>
>>>      *
>>>     ***
>>>    *****
>>>   *******
>>>  *********
>>> ***********
>>>  *********
>>>   *******
>>>    *****
>>>     ***
>>>      *
>>>
>>> What interesting, boring, efficient, slow, etc. ways are 
>>> there?
>>>
>>> Ali
>>
>> Probably, the most boring way is
>>
>> foreach(i; 0..N)
>> {
>>     foreach(j; 0..N)
>>         write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= N/2
>>                    && j - i <= N/2]);
>
> write(" *"[abs(i-N/2) + abs(j-N/2) <= N/2]);
>
>>     writeln;
>> }


Beat me. Yours is even more boring. =)

Re: GC allocation issue

2014-03-21 Thread Etienne


On 2014-03-21 10:34 AM, Etienne wrote:

> It crashes when sz approaches 0x18, it looks like (my best guess)
> the resized array doesn't get allocated but the GC still tries to scan it.


Ok I found it in the manual implementation of a malloc-based HashMap.

The right way to debug this was, sadly, to add a lot of printf calls and a 
few asserts in druntime, and to redirect stdout to a file from the shell 
(./exe > logoutput.txt). The druntime win32.mak doesn't have a debug 
build, so I had to add -debug -g in there to add symbols and make the 
sources show up instead of the disassembly in VisualD.


In this case, the logs showed gc's mark() was failing on wide ranges, so 
I added an assert in addRange to make it throw when that range was 
added, and it finally gave me the call stack of the culprit.


The issue was that a malloc range was (maybe) not being properly 
initialized before being added to the GC.


https://github.com/rejectedsoftware/vibe.d/blob/master/source/vibe/utils/hashmap.d#L221

https://github.com/rejectedsoftware/vibe.d/blob/master/source/vibe/utils/memory.d#L153

In this case, ptr isn't null and the range existed, but there's still an 
access violation from the GC for some reason. I'll keep searching for 
the root cause but it doesn't seem to be a GC issue anymore; though the 
debugging procedure could use some documentation.
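
For reference, a minimal sketch of registering malloc'd memory with the GC so that the range is zeroed before it can be scanned. It assumes the block stores string slices that may point into GC memory. allocKeys and freeKeys are hypothetical helpers and are not the vibe.d code linked above.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memset;

// Allocates count string slots outside the GC heap and registers them
// as a range the GC must scan for pointers.
string[] allocKeys(size_t count)
{
    assert(count > 0);
    immutable size = count * string.sizeof;
    auto p = malloc(size);
    assert(p !is null, "malloc failed");
    memset(p, 0, size);   // initialize before the GC can see the range
    GC.addRange(p, size); // size is in bytes, not in elements
    return (cast(string*) p)[0 .. count];
}

// Unregisters the range first, then releases the memory.
void freeKeys(string[] keys)
{
    GC.removeRange(keys.ptr);
    free(keys.ptr);
}

void main()
{
    auto keys = allocKeys(4);
    keys[0] = "hello".idup; // GC-allocated data kept alive by the range
    freeKeys(keys);
}
```

When such a block is resized, the old range has to be removed and the new pointer and size registered again; a stale or oversized range makes the collector read memory it should not touch.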


Thanks

Ranges/algorithms for aggregation

2014-03-21 Thread Luís.Marques

Is there a neat way to do this transformation with ranges and 
std.algorithm?


Input:
---
B foo
B bar
C ble
B big
A begga

Output: (aggregated and sorted on length)
---
B -> [foo, bar, big]
C -> [ble]
A -> [begga]

The most obvious way (to me) to do this without standard 
algorithms is with an AA to do the aggregation. The most obvious 
way (to me) to do this with std.algorithm is:


B foo
B bar
C ble
B big
A begga

=>

[B, foo]
[B, bar]
[C, ble]
[B, big]
[A, begga]

=>

[B, foo]
[B, bar]
[B, big]
[C, ble]
[A, begga]

=>

B -> [foo, bar, big]
C -> [ble]
A -> [begga]

But this seems wasteful on memory. Is there a better way to do 
this in a more algorithmic way?
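
As a baseline for comparison, the AA-based aggregation could be sketched as follows. It assumes every input line has the form "KEY value". The function name aggregate is hypothetical.

```d
import std.algorithm : sort;
import std.array : split;
import std.typecons : Tuple, tuple;

// Groups the second word of each line under the first word, then orders
// the groups by size, largest first.
Tuple!(string, string[])[] aggregate(string[] lines)
{
    string[][string] groups;
    foreach (line; lines)
    {
        auto parts = line.split(); // split on whitespace
        groups[parts[0]] ~= parts[1];
    }

    Tuple!(string, string[])[] result;
    foreach (key, values; groups)
        result ~= tuple(key, values);

    // AA iteration order is unspecified, so groups of equal size may
    // come out in any order.
    result.sort!((a, b) => a[1].length > b[1].length);
    return result;
}

void main()
{
    import std.stdio : writeln;
    foreach (group; aggregate(["B foo", "B bar", "C ble", "B big", "A begga"]))
        writeln(group[0], " -> ", group[1]);
}
```

Within each group the values keep their input order, because they are appended as the lines are read.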

Re: Ranges/algorithms for aggregation

2014-03-21 Thread bearophile


Luís Marques:

> Is there a neat way to do this transformation with ranges and 
> std.algorithms?
>
> Input:
> ---
> B foo
> B bar
> C ble
> B big
> A begga
>
> Output: (aggregated and sorted on length)
> ---
> B -> [foo, bar, big]
> C -> [ble]
> A -> [begga]


What is the desired output data structure? An associative array 
of dynamic arrays? Or is a dynamic array of dynamic arrays of 
2-tuples enough?


There are various ways to solve your problem.

Related:
https://d.puremagic.com/issues/show_bug.cgi?id=5968
https://d.puremagic.com/issues/show_bug.cgi?id=9842

Bye,
bearophile

Re: Ranges/algorithms for aggregation

2014-03-21 Thread Luís.Marques


On Friday, 21 March 2014 at 15:38:23 UTC, bearophile wrote:

>> Output: (aggregated and sorted on length)
>> ---
>> B -> [foo, bar, big]
>> C -> [ble]
>> A -> [begga]
>
> What is the desired output data structure? An associative array 
> of dynamic arrays? Or is a dynamic arrays of dynamic arrays of 
> 2-tuples enough?


I'm doing this for a D newbie, to teach him the range/algorithmic 
approach. The function he wrote to output the result of that 
transformation takes as an input an array of arrays (the latter), 
but he builds that input iteratively using an AA of arrays (the 
former). I asked him about that mismatch and at the time he told 
me that for now he only needed the latter, suggesting he had 
other future plans where he might need the AA, but I'm not sure. 
So let's just say that the client is unclear on his requirements, 
which does happen in the real world anyway :-).


In any case, I think the hashGroupBy is what I was asking about 
:-). Neat. (did anyone actually implement it?)


Even if a dynamic array of dynamic arrays of 2-tuples sufficed, 
I'm not sure how that would help with the intermediate step, if 
we wanted to avoid the sorting step. Did you have anything in 
particular in mind there?

Re: Ranges/algorithms for aggregation

2014-03-21 Thread bearophile


Luís Marques:

> So let's just say that the client is unclear on his 
> requirements, which does happen in the real world anyway :-).

Yes, it happens. But it's a problem, because often if you know 
what you need you can produce the results more efficiently :-)


> (did anyone actually implement it?)

Not yet.


> I'm not sure how if a dynamic arrays of dynamic arrays of 
> 2-tuples sufficed that would help with the intermediate step, 
> if we wanted to avoid the sorting step. Did you have anything 
> in particular in mind there?

I think this problem needs a sort or a hash. One possible 
solution, if you don't need an associative array as output, is to 
use a multiSort followed by building the groups using slicing. 
It could be efficient enough. Later you search the keys with 
some kind of binary search.


Bye,
bearophile

Re: Ranges/algorithms for aggregation

2014-03-21 Thread Luís.Marques


On Friday, 21 March 2014 at 16:04:45 UTC, bearophile wrote:
> I think this problem needs a sorting or a hash. One possible 
> solution, if you don't need an associative array as output, is 
> to use a multiSort followed by a building of groups using 
> slicing. It could be efficient enough. Later you search the 
> keys with some kind of binary search.


The number of keys is large and unbounded (not just three as in 
my example), so I guess this multiSort approach would not be 
practical, right? I think we really need the hashGroupBy.

Re: Ranges/algorithms for aggregation

2014-03-21 Thread Justin Whear

On Fri, 21 Mar 2014 15:22:13 +0000, Luís Marques wrote:

> Is there a neat way to do this transformation with ranges and
> std.algorithms?
>
>  Input:
>  ---
>  B foo
>  B bar
>  C ble
>  B big
>  A begga
>
>  Output: (aggregated and sorted on length)
>  ---
>  B -> [foo, bar, big]
>  C -> [ble]
>  A -> [begga]
>
> The most obvious way (to me) to do this without standard algorithms is
> with an AA to the aggregation. The most obvious way (to me) to do this
> with std.algorithms is:
>
>  B foo
>  B bar
>  C ble
>  B big
>  A begga
>
>  =>
>
>  [B, foo]
>  [B, bar]
>  [C, ble]
>  [B, big]
>  [A, begga]
>
>  =>
>
>  [B, foo]
>  [B, bar]
>  [B, big]
>  [C, ble]
>  [A, begga]
>
>  =>
>
>  B -> [foo, bar, big]
>  C -> [ble]
>  A -> [begga]
>
> But this seems wasteful on memory. Is there a better way to do this in a
> more algorithmic way?

This pull request[1] for groupBy has been hanging around for a year now, 
driving me to copy-and-paste the implementation into a couple of my 
projects.  Using it, you could do this:

auto tuples = ... // get your list of (B, foo), (B, bar), etc.
auto output = tuples.sort!`a[0] < b[0]`
                    .groupBy!`a[0] == b[0]`;
// output is a range of:
//[
// [(A, begga)],
// [(B, foo), (B, bar), (B, big)],
// [(C, ble)]
//]

The advantage being that output isn't an array at all but a lazy range of 
lazy ranges.

[1] https://github.com/D-Programming-Language/phobos/pull/1186
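
A runnable sketch of the same sort-then-group idea, assuming a Phobos release that ships std.algorithm.chunkBy (the name under which this grouping functionality is available in later Phobos versions). groupValues is a hypothetical helper, and it collects the lazy groups into arrays only so that the result is easy to inspect.

```d
import std.algorithm : chunkBy, map, sort, SwapStrategy;
import std.array : array;
import std.typecons : Tuple, tuple;

// Sorts the pairs by key (in place, mutating the caller's array), then
// collects the values of each run of equal keys.
string[][] groupValues(Tuple!(string, string)[] pairs)
{
    // A stable sort keeps the values of one key in their input order.
    pairs.sort!((a, b) => a[0] < b[0], SwapStrategy.stable);
    return pairs
        .chunkBy!((a, b) => a[0] == b[0])
        .map!(group => group.map!(t => t[1]).array)
        .array;
}

void main()
{
    import std.stdio : writeln;
    auto pairs = [tuple("B", "foo"), tuple("B", "bar"), tuple("C", "ble"),
                  tuple("B", "big"), tuple("A", "begga")];
    writeln(groupValues(pairs));
}
```

Dropping the two .array calls keeps the result lazy, as described above, at the cost of having to consume each group before advancing to the next.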

Re: Ranges/algorithms for aggregation

2014-03-21 Thread H. S. Teoh

On Fri, Mar 21, 2014 at 04:10:12PM +0000, Justin Whear wrote:
[...]
> This pull request[1] for groupBy has been hanging around for a year
> now, driving me to copy-and-paste the implementation into a couple of
> my projects.  Using it, you could do this:
>
> auto tuples = ... // get your list of (B, foo), (B, bar), etc.
> auto output = tuples.sort!`a[0] < b[0]`
>                     .groupBy!`a[0] == b[0]`;
> // output is a range of:
> //[
> // [(A, begga)],
> // [(B, foo), (B, bar), (B, big)],
> // [(C, ble)]
> //]
>
> The advantage being that output isn't an array at all but a lazy range
> of lazy ranges.
>
> [1] https://github.com/D-Programming-Language/phobos/pull/1186

Be aware, though, that groupBy only compares *adjacent* elements for
equivalence; it does not sort the input. So if your input has equivalent
elements interspersed with non-equivalent elements, you will have the
equivalent elements split into multiple runs in the output.

Example:
	auto data = [
		tuple(1, "a"),
		tuple(1, "b"),
		tuple(2, "c"),
		tuple(1, "d")
	];
	writeln(data.groupBy!((a,b) => a[0] == b[0]));

Will output:
	[[tuple(1, "a"), tuple(1, "b")], [tuple(2, "c")], [tuple(1, "d")]]

Which may not be what the OP wants.


T

-- 
Unix was not designed to stop people from doing stupid things, because that 
would also stop them from doing clever things. -- Doug Gwyn

Re: Ranges/algorithms for aggregation

2014-03-21 Thread Luís.Marques


On Friday, 21 March 2014 at 16:53:46 UTC, H. S. Teoh wrote:
> Be aware, though, that groupBy only compares *adjacent* 
> elements for equivalence; it does not sort the input. So if 
> your input has equivalent elements interspersed with 
> non-equivalent elements, you will have the equivalent elements 
> split into multiple runs in the output.


I think that's why Justin used sort. The hashGroupBy proposed by 
bearophile would avoid the sort and the additional memory usage 
though, so that would be even better.

Re: Improving IO Speed

2014-03-21 Thread Marc Schütz


On Friday, 14 March 2014 at 18:00:58 UTC, TJB wrote:

> align(1) struct TaqIdx
> {
>   align(1) char[10] symbol;
>   align(1) int tdate;
>   align(1) int begrec;
>   align(1) int endrec;
> }


Won't help with speed, but you can write it with less repetition:

align(1) struct TaqIdx
{
align(1):
  char[10] symbol;
  int tdate;
  int begrec;
  int endrec;
}

The outer align(1) is still necessary to avoid the padding.
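
A quick compile-time check of the resulting layout, as a sketch: with both the outer and the inner align(1), the fields are packed back to back and the struct itself gets no tail padding.

```d
align(1) struct TaqIdx
{
align(1):
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

// 10 + 4 + 4 + 4 bytes, with no padding between or after the fields.
static assert(TaqIdx.sizeof == 22);
static assert(TaqIdx.tdate.offsetof == 10);
static assert(TaqIdx.endrec.offsetof == 18);

void main()
{
}
```

Checks like these are cheap insurance when a struct has to match the record layout of a binary file.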

Re: Ranges/algorithms for aggregation

2014-03-21 Thread Luís.Marques


On Friday, 21 March 2014 at 17:20:38 UTC, Luís Marques wrote:
> I think that's why Justin used sort. The hashGroupBy proposed 
> by bearophile would avoid the sort and the additional memory 
> usage though, so that would be even better.


I was thinking, we don't even need the full power of sort. Is 
there a standard algorithm that makes elements with equal keys be 
in sequence, but that otherwise is less expensive than sort?

Re: Ranges/algorithms for aggregation

2014-03-21 Thread bearophile


Luís Marques:

> I was thinking, we don't even need the full power of sort. Is 
> there a standard algorithm that makes elements with equal keys 
> be in sequence, but that otherwise is less expensive than sort?


I don't know of any. How is it supposed to know where to group 
the items? Usually you build an associative array for that.


Bye,
bearophile

Re: sizeof struct no less than 1?

2014-03-21 Thread Mike


On Saturday, 22 March 2014 at 02:27:02 UTC, Mike wrote:

> Hello,
>
> Consider an 'empty' struct as follows:
>
> struct Test {}
>
> When printing its size, it's always 1.
> void main()
> {
>     writeln(Test.sizeof);
> }
>
> Output:
> 1
>
> Why is it not 0?  What about struct Test is consuming 1 byte of 
> memory?
>
> Thanks,
> Mike


Looks like this answers my question:
http://forum.dlang.org/post/l03oqc$mpq$1...@digitalmars.com

sizeof struct no less than 1?

2014-03-21 Thread Mike


Hello,

Consider an 'empty' struct as follows:

struct Test {}

When printing its size, it's always 1.
void main()
{
writeln(Test.sizeof);
}

Output:
1

Why is it not 0?  What about struct Test is consuming 1 byte of 
memory?


Thanks,
Mike

Re: sizeof struct no less than 1?

2014-03-21 Thread Philpax

C++ exhibits the same behaviour for what is most likely the same 
reason: so that empty structs can be allocated without two 
distinct objects having the same memory address ( 
http://stackoverflow.com/questions/2362097/why-is-the-size-of-an-empty-class-in-c-not-zero 
)
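
A small sketch of the consequence described above: because the empty struct has size 1, two elements of an array of such structs get distinct addresses.

```d
struct Test {}

static assert(Test.sizeof == 1);

void main()
{
    Test[2] pair;
    // Each element occupies its own byte, so the addresses differ.
    assert(&pair[0] !is &pair[1]);
    assert(cast(size_t) &pair[1] - cast(size_t) &pair[0] == 1);
}
```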

Re: Ranges/algorithms for aggregation

2014-03-21 Thread Luís.Marques


On Saturday, 22 March 2014 at 01:08:11 UTC, bearophile wrote:
> how is it supposed to know where to group items? Usually you 
> build an associative array for that.


It would swap elements, like sort, so it doesn't need to put them 
anywhere, just permute them. The advantage is this:


Input: [7, 3, 7, 1, 1, 1, 1]
Output sort: [1, 1, 1, 1, 3, 7, 7]
Output groupSort: [3, 7, 7, 1, 1, 1, 1]

groupSort (or whatever it would be called) only makes one swap, 
while sort makes a lot of them. So groupSort is a lot cheaper. 
I'm not sure what the asymptotic time complexity of groupSort is, 
at this moment's notice (I guess it would depend on what strategy 
it would use).
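
One possible strategy, sketched under loud assumptions: this version is not in place and uses an associative array of counts, so it trades the swaps for extra memory, and it emits groups in first-seen order (so the example input yields [7, 7, 3, 1, 1, 1, 1] rather than the output shown above). It also treats each element as its own key. groupTogether is a hypothetical name.

```d
// Returns a copy of input in which equal elements are adjacent, with
// groups ordered by first appearance. Runs in expected O(n) time.
T[] groupTogether(T)(const(T)[] input)
{
    size_t[T] counts; // element -> number of occurrences
    T[] order;        // distinct elements in first-seen order
    foreach (x; input)
    {
        if (auto p = x in counts)
            ++*p;
        else
        {
            counts[x] = 1;
            order ~= x;
        }
    }

    T[] result;
    result.reserve(input.length);
    foreach (key; order)
        foreach (_; 0 .. counts[key])
            result ~= key;
    return result;
}

void main()
{
    assert(groupTogether([7, 3, 7, 1, 1, 1, 1]) == [7, 7, 3, 1, 1, 1, 1]);
}
```

An in-place variant that minimizes swaps, as proposed above, would need a different strategy; this sketch only shows that grouping without ordering can be done with a hash.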

Re: Alias template param - Undefined Identifier

2014-03-21 Thread Chris Williams


On Wednesday, 19 March 2014 at 00:07:04 UTC, Chris Williams wrote:
> ...probably something along the lines of making all of my 
> functions a static function in a struct, which I then pass into 
> a template which processes UDAs to generate functions at the top 
> with the same names as the originals in the struct, which call 
> the struct variants. It's also longer to write and debug.


Here's a simple (unsafe and inflexible) implementation, should 
anyone want to build it out into something that can accept 
multiple arguments and return types.



import std.stdio;

// Builds the source of a top-level function that forwards to the wrapper
string wrap(string wrapperName, string structName, string innerName) {
    return "void " ~ innerName ~ "() {" ~ wrapperName ~ "( &"
        ~ structName ~ "." ~ innerName ~ " );}";
}

template decorate(T) {
    template ForeachMember(Mbr...) {
        static if (__traits(getAttributes, __traits(getMember, T, Mbr[0])).length > 0) {
            static if (Mbr.length > 1) {
                enum ForeachMember =
                    wrap(
                        __traits(identifier,
                            __traits(getAttributes, __traits(getMember, T, Mbr[0]))[0]),
                        __traits(identifier, T),
                        Mbr[0]
                    )
                    ~ ForeachMember!(Mbr[1..$])
                ;
            }
            else {
                enum ForeachMember =
                    wrap(
                        __traits(identifier,
                            __traits(getAttributes, __traits(getMember, T, Mbr[0]))[0]),
                        __traits(identifier, T),
                        Mbr[0]
                    )
                ;
            }
        }
        else {
            static if (Mbr.length > 1) {
                enum ForeachMember = "" ~ ForeachMember!(Mbr[1..$]);
            }
            else {
                enum ForeachMember = "";
            }
        }
    }

    enum decorate = ForeachMember!(__traits(allMembers, T));
}

void BeforeAfter(void function() fun) {
    writeln("Before");
    write("\t"); fun();
    writeln("After");
}

void Llama(void function() fun) {
    writeln("Llama");
    write("\t"); fun();
    writeln("Llama");
}

struct Foo {
    @BeforeAfter
    static void hello() {
        writeln("Hello");
    }

    @Llama
    static void and() {
        writeln("and");
    }

    @BeforeAfter
    static void goodbye() {
        writeln("Goodbye");
    }
}

mixin(decorate!(Foo));

void main() {
    hello();
    and();
    goodbye();
}
