Re: Profiling

2021-02-11 Thread Imperatorn via Digitalmars-d-learn

On Wednesday, 10 February 2021 at 23:42:31 UTC, mw wrote:

On Wednesday, 10 February 2021 at 11:52:51 UTC, JG wrote:


As a follow-up question, I would like to know what tool people 
use to profile D programs?


I use this one:

https://code.dlang.org/packages/profdump

e.g.

```
dub build --build=profile

# run your program to generate trace.log

profdump -b trace.log trace.log.b
profdump -f --dot --threshold 1 trace.log trace.log.dot
echo 'view it with: xdot trace.log.dot'
```


Nice, didn't even know that existed


Re: Trying to reduce memory usage

2021-02-11 Thread frame via Digitalmars-d-learn

On Friday, 12 February 2021 at 02:22:35 UTC, H. S. Teoh wrote:

This turns the OP's O(n log n) algorithm into an O(n) algorithm, 
doesn't need to copy the entire content of the file into memory, 
and also uses much less memory by storing only hashes.


But this kind of hash may be insufficient to avoid hash 
collisions. For data this large, slower but stronger algorithms 
like SHA are advisable.


Also, associative arrays use the same weak algorithm, so you 
can run into collision issues there too. Thus using the hash of 
the string data as the key can be a problem. I always use a quick 
hash as the key but actually store a collection of hashes under it 
and do a lookup to be on the safe side.
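
A minimal D sketch of the approach described above, offered as an illustration rather than the poster's actual code. The helper name `firstSighting`, the choice of SHA-256 as the "stronger" hash, and the bucket layout are all assumptions made for this example; only `hashOf` and the general idea come from the thread.

```d
import std.digest.sha : sha256Of;
import std.stdio;

/// Hypothetical helper: returns true the first time a line is seen,
/// false for repeats. `seen` maps a quick hash to every strong digest
/// that shares that quick hash.
bool firstSighting(ref ubyte[32][][size_t] seen, const(char)[] line)
{
    immutable quick = hashOf(line);        // fast, but collisions are possible
    ubyte[32] strong = sha256Of(line);     // slower, collision-resistant

    if (auto bucket = quick in seen)
    {
        foreach (const ref digest; *bucket)
            if (digest == strong)
                return false;              // this exact line was seen before
        *bucket ~= strong;                 // quick hash collided, line is new
        return true;
    }
    seen[quick] = [strong];
    return true;
}

void main()
{
    ubyte[32][][size_t] seen;
    foreach (line; stdin.byLine)
        if (firstSighting(seen, line))
            writeln(line);
}
```

Compared with storing only the quick hash, this keeps a 32-byte digest per unique line, so it trades some of the memory saving for protection against two different lines being treated as duplicates.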





Re: Trying to reduce memory usage

2021-02-11 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Feb 12, 2021 at 01:45:23AM +, mw via Digitalmars-d-learn wrote:
> On Friday, 12 February 2021 at 01:23:14 UTC, Josh wrote:
> > I'm trying to read in a text file that has many duplicated lines and
> > output a file with all the duplicates removed.
> 
> If you only need to remove duplicates, keeping (and comparing) a
> string hash for each line is good enough. Memory usage should be
> just n integers.
[...]

+1. This can even be done on the fly: you don't even need to use .sort
or .uniq.  Just something like this:

	bool[size_t] hashes;
	foreach (line; stdin.byLine) {
		auto h = hashOf(line); // use a suitable hash function here
		if (h !in hashes) {
			outfile.writeln(line);
			hashes[h] = true;
		}
		// else this line already seen before; just skip it
	}

This turns the OP's O(n log n) algorithm into an O(n) algorithm, doesn't
need to copy the entire content of the file into memory, and also uses
much less memory by storing only hashes.


T

-- 
MASM = Mana Ada Sistem, Man!


Re: Trying to reduce memory usage

2021-02-11 Thread mw via Digitalmars-d-learn

On Friday, 12 February 2021 at 01:23:14 UTC, Josh wrote:
I'm trying to read in a text file that has many duplicated 
lines and output a file with all the duplicates removed.


If you only need to remove duplicates, keeping (and comparing) a 
string hash for each line is good enough. Memory usage should be 
just n integers.





Trying to reduce memory usage

2021-02-11 Thread Josh via Digitalmars-d-learn
I'm trying to read in a text file that has many duplicated lines 
and output a file with all the duplicates removed. By the end of 
this code snippet, the memory usage is ~5x the size of the infile 
(which can be multiple GB each), and when this is in a loop the 
memory usage becomes unmanageable and often results in an 
OutOfMemory error or just a complete lock up of the system. Is 
there a way to reduce the memory usage of this code without 
sacrificing speed to any noticeable extent? My assumption is the 
.sort.uniq needs improving, but I can't think of an easier/not 
much slower way of doing it.


Windows 10 x64
LDC - the LLVM D compiler (1.21.0-beta1):
  based on DMD v2.091.0 and LLVM 10.0.0

---

auto filename = "path\\to\\file.txt.temp";
auto array = appender!(string[]);
File infile = File(filename, "r");
foreach (line; infile.byLine) {
  array ~= line.to!string;
}
File outfile = File(stripExtension(filename), "w");
foreach (element; (array[]).sort.uniq) {
  outfile.myrawWrite(element ~ "\n"); // used to not print the \r on windows
}
outfile.close;
array.clear;
array.shrinkTo(0);
infile.close;

---

Thanks.


Re: how to properly compare this type?

2021-02-11 Thread Steven Schveighoffer via Digitalmars-d-learn

On 2/9/21 6:12 PM, Jack wrote:
static if(is(typeof(__traits(getMember, A, member)) == string function(string)))


That's not what you want. string function(string) is a *pointer* to a 
function that accepts a string and returns a string.


In addition to getting the overloads (you only get one "b" in the list 
of members), take the address of the overload. This worked for me:


foreach(overload; __traits(getOverloads, A, member))
    static if(is(typeof(&overload) == string function(string)))
    {
        arr ~= member;
    }

-Steve
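
For context, here is a self-contained sketch of how the loop above might sit inside a complete program. The struct `A` below is a hypothetical stand-in (the original type is not shown in this message), and the helper name `stringFunctions` is likewise made up for illustration; static functions are used here only to keep the example simple.

```d
import std.stdio;

// Hypothetical stand-in for the poster's type A.
struct A
{
    static string b(string s) { return s; }        // string function(string)
    static int b(int x) { return x; }              // same name, other signature
    static string c(string s) { return s ~ "!"; }  // string function(string)
}

/// Collects the names of members of T that have an overload whose
/// address has the type `string function(string)`.
string[] stringFunctions(T)()
{
    string[] arr;
    foreach (member; __traits(allMembers, T))
    {
        // Walk every overload, since allMembers lists "b" only once.
        foreach (overload; __traits(getOverloads, T, member))
        {
            static if (is(typeof(&overload) == string function(string)))
                arr ~= member;
        }
    }
    return arr;
}

void main()
{
    writeln(stringFunctions!A());
}
```

Taking `&overload` is what turns the function symbol into a function pointer type that can be compared against `string function(string)`.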


Re: Real simple unresolved external symbols question...

2021-02-11 Thread Imperatorn via Digitalmars-d-learn

On Thursday, 11 February 2021 at 00:18:23 UTC, H. S. Teoh wrote:
On Wed, Feb 10, 2021 at 11:35:27PM +, WhatMeWorry via 
Digitalmars-d-learn wrote: [...]

Okay, thanks. Then why does the README.md at

https://github.com/dlang/druntime

say "Runtime is typically linked together with Phobos in a 
release such that the compiler only has to link to a single 
library to provide the user with the runtime and the standard 
library."


Probably outdated information.  Somebody should submit a PR for 
it. ;-)



T


Can someone in here update the docs?