How to get output of piped process?

2021-02-16 Thread Jedi via Digitalmars-d-learn

I am using pipeShell and have redirected stdout, stderr, and stdin.

I am trying to read from the output and display it in my app. I 
have followed the code below almost exactly, except that I use 
tryWait and flush because the app continuously updates its output. 
(It prints progress text on the same line, and I'm trying to poll 
that to report to the user.)



auto pipes = pipeProcess("my_application", Redirect.stdout | Redirect.stderr);

scope(exit) wait(pipes.pid);

// Store lines of output.
string[] output;
foreach (line; pipes.stdout.byLine) output ~= line.idup;

// Store lines of errors.
string[] errors;
foreach (line; pipes.stderr.byLine) errors ~= line.idup;


My code:

auto p = pipeShell(`app.exe "` ~ f.name ~ `"`,
    Redirect.stdout | Redirect.stdin | Redirect.stderr);

while (!tryWait(p.pid).terminated)
{
    string[] output;
    foreach (line; p.stdout.byLine)
    {
        output ~= line.idup;
        writeln(line);
    }

    string[] errors;
    foreach (line; p.stderr.byLine)
    {
        errors ~= line.idup;
        writeln("Err:" ~ line);
    }
}

wait(p.pid);

None of this works, though. What is strange is that when I close 
out the debugger, the app starts working (no console output, but I 
am able to see that it is doing something), though it is very slow.


auto p = executeShell(`app.exe "`~f.name~`"`);

does work, except that I then have no output or input. I have 
another app where I use the exact same code and can get the output 
and parse it, but only after the app terminates. I imagine the 
issue here is that I'm trying to get the output while the app is 
still running.



I want to be able to get the output so I can cut down much of the 
clutter and give a progress report. I am OK with simply hooking the 
app's console input and output up to mine, just as if I had run 
app.exe directly.
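
Roughly, what I am aiming for is something like the following (a 
sketch only: byLine blocks until it sees a newline, so same-line 
progress written with '\r' would never arrive, which is why this 
reads single raw bytes instead):

import std.process : pipeShell, Redirect, tryWait, wait;
import std.stdio : stdout, write;

auto p = pipeShell(`app.exe "` ~ f.name ~ `"`,
    Redirect.stdout | Redirect.stdin | Redirect.stderr);

ubyte[1] buf;                       // a larger buffer would block until full,
while (!tryWait(p.pid).terminated)  // so read one byte at a time
{
    if (p.stdout.rawRead(buf[]).length)
    {
        write(cast(char) buf[0]);   // echo progress bytes as they arrive
        stdout.flush();
    }
}
wait(p.pid);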


Re: Trying to reduce memory usage

2021-02-16 Thread tsbockman via Digitalmars-d-learn

On Wednesday, 17 February 2021 at 04:10:24 UTC, tsbockman wrote:
On files small enough to fit in RAM, it is similar in speed to 
the other solutions posted, but less memory hungry. Memory 
consumption in this case is around (sourceFile.length + 32 * 
lineCount * 3 / 2) bytes. Run time is similar to other posted 
solutions: about 3 seconds per GiB on my desktop.


Oops, I think the memory consumption should be (sourceFile.length 
+ 32 * (lineCount + largestBucket.lineCount / 2)) bytes. (In the 
limit where everything ends up in one bucket, it's the same, but 
that shouldn't normally happen unless the entire file has only 
one unique line in it.)
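
As a rough worked example of the corrected formula (numbers purely 
illustrative): a 1 GiB file with 10 million lines whose largest 
bucket holds 100 thousand of them would need about 2^30 + 32 * 
(10_000_000 + 100_000 / 2) bytes, or roughly 1 GiB + 0.3 GiB ≈ 
1.3 GiB.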


Re: Trying to reduce memory usage

2021-02-16 Thread tsbockman via Digitalmars-d-learn

On Friday, 12 February 2021 at 01:23:14 UTC, Josh wrote:
I'm trying to read in a text file that has many duplicated 
lines and output a file with all the duplicates removed. By the 
end of this code snippet, the memory usage is ~5x the size of 
the infile (which can be multiple GB each), and when this is in 
a loop the memory usage becomes unmanageable and often results 
in an OutOfMemory error or just a complete lock up of the 
system. Is there a way to reduce the memory usage of this code 
without sacrificing speed to any noticeable extent? My 
assumption is the .sort.uniq needs improving, but I can't think 
of an easier/not much slower way of doing it.


I spent some time experimenting with this problem, and here is 
the best solution I found, assuming that perfect de-duplication 
is required. (I'll put the code up on GitHub / dub if anyone 
wants to have a look; a condensed sketch also follows the outline 
below.)


--
0) Memory map the input file, so that the program can pass around 
slices to it directly without making copies. This also allows the 
OS to page it in and out of physical memory for us, even if it is 
too large to fit all at once.

1) Pre-compute the required space for all large data 
structures, even if an additional pass is required to do so. This 
makes the rest of the algorithm significantly more efficient with 
memory, time, and lines of code.


2) Do a top-level bucket sort of the file using a small (8-16 
bit) hash into some scratch space. The target can be either in 
RAM, or in another memory-mapped file if we really need to 
minimize physical memory use.


The small hash can be a few bits taken off the top of a larger 
hash (I used std.digest.murmurhash). The larger hash is cached 
for use later on, to accelerate string comparisons, avoid 
unnecessary I/O, and perhaps do another level of bucket sort.


If there is too much data to put in physical memory all at once, 
be sure to copy the full text of each line into a region of the 
scratch file where it will be together with the other lines that 
share the same small hash. This is critical, as otherwise the 
string comparisons in the next step turn into slow random I/O.


3) For each bucket, sort, filter out duplicates, and write to 
the output file. Any sorting algorithm(s) may be used if all 
associated data fits in physical memory. If not, use a merge 
sort, whose access patterns won't thrash the disk too badly.


4) Manually release all large data structures, and delete the 
scratch file, if one was used. This is not difficult to do, since 
their lifetimes are well-defined, and it ensures that the program 
won't hang on to GiB of space any longer than necessary.

--
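
To make the outline concrete, here is a heavily condensed sketch 
of steps 0, 2, and 3 for the in-memory case. It is illustrative 
only: it skips step 1's pre-sizing, the cached large hash, and all 
of the scratch-file machinery, and the function and file names are 
placeholders.

import std.algorithm : sort, uniq;
import std.digest.murmurhash : MurmurHash3;
import std.mmfile : MmFile;
import std.stdio : File;
import std.string : lineSplitter;

void dedupe(string inPath, string outPath)
{
    // Step 0: memory-map the input so buckets hold slices, not copies.
    auto source = new MmFile(inPath);
    auto text = cast(const(char)[]) source[];

    // Step 2: bucket lines by a small hash (here, the top 8 bits of a
    // 32-bit MurmurHash3 digest).
    const(char)[][][256] buckets;
    foreach (line; text.lineSplitter)
    {
        MurmurHash3!32 hasher;
        hasher.put(cast(const(ubyte)[]) line);
        buckets[hasher.finish()[0]] ~= line;
    }

    // Step 3: sort each bucket, drop duplicates, write the result.
    auto sink = File(outPath, "w");
    foreach (ref bucket; buckets)
        foreach (line; bucket.sort.uniq)
            sink.writeln(line);
}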

I wrote an optimized implementation of this algorithm. It's fast, 
efficient, and really does work on files too large for physical 
memory. However, it is complicated, at almost 800 lines.


On files small enough to fit in RAM, it is similar in speed to 
the other solutions posted, but less memory hungry. Memory 
consumption in this case is around (sourceFile.length + 32 * 
lineCount * 3 / 2) bytes. Run time is similar to other posted 
solutions: about 3 seconds per GiB on my desktop.


When using a memory-mapped scratch file to accommodate huge 
files, the physical memory required is around 
max(largestBucket.data.length + 32 * largestBucket.lineCount * 3 
/ 2, bucketCount * writeBufferSize) bytes. (Virtual address space 
consumption is far higher, and the OS will commit however much 
physical memory is available and not needed by other tasks.) The 
run time is however long it takes the disk to read the source 
file twice, write a (sourceFile.length + 32 * lineCount * 3 / 2) 
byte scratch file, read back the scratch file, and write the 
destination file.


I tried it with a 38.8 GiB, 380_000_000 line file on a magnetic 
hard drive. It needed a 50.2 GiB scratch file and took about an 
hour (after much optimization and many bug fixes).


Re: is it possible to compile individual module separately?

2021-02-16 Thread Paul Backus via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 17:49:42 UTC, Anonymouse wrote:

On Tuesday, 16 February 2021 at 17:26:06 UTC, Paul Backus wrote:

On Tuesday, 16 February 2021 at 17:15:25 UTC, Anonymouse wrote:
You can also use dub build --build-mode=singleFile, and it 
will compile one file at a time. It'll be slow, but slow is 
better than OOM.


singleFile is for single-file packages [1]. The option you're 
thinking of is --build-mode=separate.


[1] https://dub.pm/advanced_usage.html#single-file


No, I do mean singleFile.

$ dub build --build-mode=singleFile --force
[...]


I stand corrected. Shouldn't have trusted the documentation so 
much, I guess.


Re: is it possible to compile individual module separately?

2021-02-16 Thread Anonymouse via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 17:26:06 UTC, Paul Backus wrote:

On Tuesday, 16 February 2021 at 17:15:25 UTC, Anonymouse wrote:
You can also use dub build --build-mode=singleFile, and it 
will compile one file at a time. It'll be slow, but slow is 
better than OOM.


singleFile is for single-file packages [1]. The option you're 
thinking of is --build-mode=separate.


[1] https://dub.pm/advanced_usage.html#single-file


No, I do mean singleFile.

$ dub build --build-mode=singleFile --force
Performing "debug" build using /usr/local/bin/ldc2 for x86_64.
arsd-official:characterencodings 9.1.2: building configuration 
"library"...
Compiling 
../../.dub/packages/arsd-official-9.1.2/arsd-official/characterencodings.d...

Linking...
arsd-official:dom 9.1.2: building configuration "library"...
Compiling 
../../.dub/packages/arsd-official-9.1.2/arsd-official/dom.d...

Linking...
lu 1.1.2: building configuration "library"...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/common.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/container.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/conv.d...
Compiling 
../../.dub/packages/lu-1.1.2/lu/source/lu/deltastrings.d...

Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/json.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/meld.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/numeric.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/objmanip.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/package.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/semver.d...
Compiling 
../../.dub/packages/lu-1.1.2/lu/source/lu/serialisation.d...

Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/string.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/traits.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/typecons.d...
Compiling ../../.dub/packages/lu-1.1.2/lu/source/lu/uda.d...
Linking...
dialect 1.1.1: building configuration "library"...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/common.d...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/defs.d...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/package.d...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/parsing.d...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/postprocessors/package.d...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/postprocessors/twitch.d...
Compiling 
../../.dub/packages/dialect-1.1.1/dialect/source/dialect/semver.d...

Linking...
cachetools 0.3.1: building configuration "library"...
Compiling 
../../.dub/packages/cachetools-0.3.1/cachetools/source/cachetools/cache.d...
Compiling 
../../.dub/packages/cachetools-0.3.1/cachetools/source/cachetools/cache2q.d...
Compiling 
../../.dub/packages/cachetools-0.3.1/cachetools/source/cachetools/cachelru.d...
Compiling 
../../.dub/packages/cachetools-0.3.1/cachetools/source/cachetools/containers/hashmap.d...

^C


Re: is it possible to compile individual module separately?

2021-02-16 Thread Paul Backus via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 17:15:25 UTC, Anonymouse wrote:
You can also use dub build --build-mode=singleFile, and it will 
compile one file at a time. It'll be slow, but slow is better 
than OOM.


singleFile is for single-file packages [1]. The option you're 
thinking of is --build-mode=separate.


[1] https://dub.pm/advanced_usage.html#single-file


Re: is it possible to compile individual module separately?

2021-02-16 Thread Anonymouse via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 17:06:21 UTC, evilrat wrote:

On Tuesday, 16 February 2021 at 07:01:53 UTC, bokuno_D wrote:

i run "dub build" on it. but OOM kill the compiler.
-
is there a way to reduce memory consumtion of the compiler?
or maybe third party tool? alternative to dub?


Assuming you are using DMD, there is a -lowmem switch to enable 
garbage collection in the compiler (it is off by default for 
faster builds).


Open dub.json and add a dflags array with -lowmem, something like 
this:


   "dflags": [ "-lowmem" ],


Ideally this would work, but see 
https://issues.dlang.org/show_bug.cgi?id=20699. It does work with 
ldc, though.


You can also use dub build --build-mode=singleFile, and it will 
compile one file at a time. It'll be slow, but slow is better than 
OOM.


Re: is it possible to compile individual module separately?

2021-02-16 Thread evilrat via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 07:01:53 UTC, bokuno_D wrote:

i run "dub build" on it. but OOM kill the compiler.
-
is there a way to reduce memory consumtion of the compiler?
or maybe third party tool? alternative to dub?


Assuming you are using DMD, there is a -lowmem switch to enable 
garbage collection in the compiler (it is off by default for 
faster builds).


Open dub.json and add a dflags array with -lowmem, something like 
this:


   "dflags": [ "-lowmem" ],

Then build normally. If you have gdc or ldc installed, dub might 
pick the first compiler in your %PATH%; a compiler can be selected 
with the --compiler option:


   dub build --compiler=dmd


Re: Fastest way to "ignore" elements from array without removal

2021-02-16 Thread Steven Schveighoffer via Digitalmars-d-learn

On 2/16/21 1:03 AM, H. S. Teoh wrote:

For the former, you can use the read-head/write-head algorithm: keep two
indices as you iterate over the array, say i and j: i is for reading
(incremented every iteration) and j is for writing (not incremented if
array[i] is to be deleted).  Each iteration, if j < i, copy array[i] to
array[j].  At the end of the loop, assign the value of j to the length
of the array.


std.algorithm.mutation.remove does this for you.

It's just a bit awkward, as it doesn't do this based on values; 
you have to pass a lambda.


auto removed = arr.remove!(v => v == target); // removed is now the truncated array


And if you don't care about preserving the order, it can be done faster:

auto removed = arr.remove!(v => v == target, SwapStrategy.unstable);

-Steve


Re: Fastest way to "ignore" elements from array without removal

2021-02-16 Thread Paul Backus via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 09:08:33 UTC, z wrote:
Does filter support multiple arguments for the predicate? (I.e., 
using a function that has a "bool function(T1 a, T2 b)" 
prototype.)


I am not sure exactly what you are asking here, but you can 
probably accomplish what you want by combining filter with 
std.range.chunks or std.range.slide.


http://phobos.dpldocs.info/std.range.chunks.html
http://phobos.dpldocs.info/std.range.slide.html
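
For instance, a predicate over two adjacent elements can be 
phrased as a single-argument predicate over windows of length 2. 
A minimal sketch with made-up data:

import std.algorithm : filter;
import std.range : slide;
import std.stdio : writeln;

void main()
{
    auto a = [1, 5, 2, 8, 3];
    // Keep only the adjacent pairs (x, y) with x < y.
    auto rising = a.slide(2).filter!(w => w[0] < w[1]);
    writeln(rising); // [[1, 5], [2, 8]]
}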

If not, I could still implement the function inside the loop, but 
that would be unwieldy.
And does it create copies on every call? This is important 
because if I end up using .filter, it will be called a 
six-to-eight-digit number of times.


filter does not create any copies of the original array. The same 
is true for pretty much everything in std.range and std.algorithm.


Re: Constructor called instead of opAssign()

2021-02-16 Thread frame via Digitalmars-d-learn
On Tuesday, 16 February 2021 at 09:04:43 UTC, Boris Carvajal 
wrote:


I don't think this is intended; rather, it appears to be a 
bug/deficiency in the constructor flow analysis of DMD, which 
from what I'm reading is very rudimentary.


If I'm using a delegate in B, supplied to super() and called in 
A, then it works :P


Re: Fastest way to "ignore" elements from array without removal

2021-02-16 Thread z via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 06:03:50 UTC, H. S. Teoh wrote:
It depends on what your goal is.  Do you want to permanently 
remove the items from the array?  Or only skip over some items 
while iterating over it?  For the latter, see 
std.algorithm.iteration.filter.
The array itself is read only, so it'll have to be an array of 
pointers/indexes.


For the former, you can use the read-head/write-head algorithm: 
keep two indices as you iterate over the array, say i and j: i 
is for reading (incremented every iteration) and j is for 
writing (not incremented if array[i] is to be deleted).  Each 
iteration, if j < i, copy array[i] to array[j].  At the end of 
the loop, assign the value of j to the length of the array.


Example:

int[] array = ...;
size_t i = 0, j = 0;
while (i < array.length)
{
    doSomething(array[i]);
    if (!shouldDelete(array[i]))
        j++;
    if (j < i)
        array[j] = array[i];
    i++;
}
array.length = j;

Basically, the loop moves elements up from the back of the 
array on top of elements to be deleted.  This is done in tandem 
with processing each element, so it requires only traversing 
array elements once, and copies array elements at most once for 
the entire loop.  Array elements are also read / copied 
sequentially, to maximize CPU cache-friendliness.



T


This is most likely ideal for what I'm trying to do. 
(Resizes/removals will probably have to propagate to other 
arrays.)
The only problem is that it does not work with the first element, 
but I could always handle that special case on my own. [1]
I'll probably use .filter or an equivalent for an initial first 
pass and this algorithm for the rest. Thank you both!


[1] https://run.dlang.io/is/f9p29A (The first element is still 
there, and the last element is missing; both occur if the first 
element didn't pass the check.)
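
For reference, the off-by-one appears to be that j is incremented 
before the copy, so the write head runs one slot ahead after the 
first deletion. A variant that copies before incrementing the 
write head handles the first element correctly (a minimal, 
self-contained sketch; doSomething is omitted and the predicate is 
made up):

import std.stdio : writeln;

void main()
{
    int[] array = [1, 2, 2, 3, 2, 4];
    bool shouldDelete(int x) { return x == 2; }

    size_t i = 0, j = 0;
    while (i < array.length)
    {
        if (!shouldDelete(array[i]))
        {
            if (j < i)
                array[j] = array[i]; // copy the kept element down
            j++;                     // advance the write head only on a keep
        }
        i++;
    }
    array.length = j;
    writeln(array); // [1, 3, 4]
}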


Re: Fastest way to "ignore" elements from array without removal

2021-02-16 Thread z via Digitalmars-d-learn

On Tuesday, 16 February 2021 at 04:43:33 UTC, Paul Backus wrote:

On Tuesday, 16 February 2021 at 04:20:06 UTC, z wrote:
What would be the overall best manner (in ease of implementation 
and speed) to arbitrarily remove an item in the middle of an 
array while iterating through it?


http://phobos.dpldocs.info/std.algorithm.iteration.filter.html


Does filter support multiple arguments for the predicate? (I.e., 
using a function that has a "bool function(T1 a, T2 b)" prototype.)
If not, I could still implement the function inside the loop, but 
that would be unwieldy.
And does it create copies on every call? This is important because 
if I end up using .filter, it will be called a six-to-eight-digit 
number of times.


Re: Constructor called instead of opAssign()

2021-02-16 Thread Boris Carvajal via Digitalmars-d-learn

On Sunday, 14 February 2021 at 08:46:34 UTC, frame wrote:

The first instance is in A - and why opAssign then works there?


Sorry, I didn't pay close enough attention.

It seems the detection of first assignment only happens when the 
field and constructor have the same parent, so it doesn't work 
either if the field is from a base or derived class (your case, 
by means of casting 'this').


I don't think this is intended; rather, it appears to be a 
bug/deficiency in the constructor flow analysis of DMD, which 
from what I'm reading is very rudimentary.