Avoid if statements for checking neighboring indexes in a 2D array
My goal is to find connected components in a 2D array for example finding connected '*' chars below. x x x x x x x x x x x x x x * * x x x x * * x x x x x * * x * x x x x x There are two connected '*' group in this example. First group is composes of six '*' located closer to middle and the second group composes only one '*' char located in the left bottom corner. Do to this I generally implement a recursive algorithm which repeat calling the same function by checking all neighbors around the current index. I generally end up with something like : void foo( int row, int col) { //Do something here like caching the index if ( twoDimensionData[row - 1][col] == '*') foo(row- 1, col); else if ( twoDimensionData[row + 1][col] == '*') foo(row+ 1, col); else if ( twoDimensionData[row - 1 ][col - 1] == '*') foo(row - 1, col - 1); //. I need 5 more of this bad boys I mean if checks } Is there any better way to achieve this with cool std functions like enumerate or iota without needing to write eight if checks?
Re: Avoid if statements for checking neighboring indexes in a 2D array
On Sunday, 16 July 2017 at 10:37:39 UTC, kerdemdemir wrote: My goal is to find connected components in a 2D array for example finding connected '*' chars below. x x x x x x x x x x x x x x * * x x x x * * x x x x x * * x * x x x x x There are two connected '*' group in this example. First group is composes of six '*' located closer to middle and the second group composes only one '*' char located in the left bottom corner. Do to this I generally implement a recursive algorithm which repeat calling the same function by checking all neighbors around the current index. I generally end up with something like : void foo( int row, int col) { //Do something here like caching the index if ( twoDimensionData[row - 1][col] == '*') foo(row- 1, col); else if ( twoDimensionData[row + 1][col] == '*') foo(row+ 1, col); else if ( twoDimensionData[row - 1 ][col - 1] == '*') foo(row - 1, col - 1); //. I need 5 more of this bad boys I mean if checks } Is there any better way to achieve this with cool std functions like enumerate or iota without needing to write eight if checks? What you probably want is a convolution, used a lot in image processing. insted of using recursion you walk left to right in blocks of 3x3 and compute a "sum" and then do the same vertically, then each cell contains the number of neighbours that are *. In this case you want the kernel 111 101 111 o o o x x x o o o x x x o o # * x x x x * * x x x x x * * x * x x x x x x o o o x x x o o o x x x o # # x x x x * * x x x x x * * x * x x x x x x x o o o x x x o o o x x x # # o x x x * * x x x x x * * x * x x x x x x x x o o o x x x o o o x x * # o o x x * * x x x x x * * x * x x x x x Have a look at the video on http://halide-lang.org describing the different methods used.
Re: Avoid if statements for checking neighboring indexes in a 2D array
On Sunday, 16 July 2017 at 10:37:39 UTC, kerdemdemir wrote: My goal is to find connected components in a 2D array for example finding connected '*' chars below. x x x x x x x x x x x x x x * * x x x x * * x x x x x * * x * x x x x x ... Is there any better way to achieve this with cool std functions like enumerate or iota without needing to write eight if checks? I don't know of a library facility to do this. Still, there is a language-agnostic way to make it more concise. Instead of repeating eight similar blocks, define an array of delta-rows and delta-columns to neighboring cells, and use that array in a loop. A complete example follows: - import std.algorithm, std.array, std.range, std.stdio; immutable int dirs = 8; immutable int [dirs] dRow = [-1, -1, -1, 0, +1, +1, +1, 0]; immutable int [dirs] dCol = [-1, 0, +1, +1, +1, 0, -1, -1]; char [] [] arr; int componentSizeRecur (int row, int col) { int res = 1; arr[row][col] = 'x'; foreach (dir; 0..dirs) { auto nRow = row + dRow[dir]; auto nCol = col + dCol[dir]; if (arr[nRow][nCol] == '*') res += componentSizeRecur (nRow, nCol); } return res; } void main () { arr = ["xxx", "*xx", "xx**xxx", "xx**x*x", "xxx", ].map !(line => line.dup).array; foreach (row; 0..arr.length) foreach (col; 0..arr.front.length) if (arr[row][col] == '*') writeln (componentSizeRecur (row, col)); } - If the neighbors array is regular and known in advance (like, 4 edge-connected cells, or 8 corner-connected cells as here), you may also like to loop over possible deltas and pick the good ones, like below: - int componentSizeRecur (int row, int col) { int res = 1; arr[row][col] = 'x'; foreach (dRow; -1..+2) foreach (dCol; -1..+2) if (dRow || dCol) { auto nRow = row + dRow; auto nCol = col + dCol; if (arr[nRow][nCol] == '*') res += componentSizeRecur (nRow, nCol); } return res; } - I have to make two additional notes. 1. This works only if the border does not contain '*' characters. To make it work without that restriction, either add two sentinel rows and columns at the four borders of the array, or put an if on nRow and nCol before using them. 2. The recursive solution can eat up lots of stack. If you intend using it on large arrays, make sure you don't hit the stack size limit of the environment. On Windows, it can be achieved by a compiler switch like "-L/STACK:268435456". On Linux, the "ulimit" command may help. Ivan Kazmenko.
Re: Avoid if statements for checking neighboring indexes in a 2D array
On 16.07.2017 12:37, kerdemdemir wrote: My goal is to find connected components in a 2D array for example finding connected '*' chars below. x x x x x x x x x x x x x x * * x x x x * * x x x x x * * x * x x x x x There are two connected '*' group in this example. First group is composes of six '*' located closer to middle and the second group composes only one '*' char located in the left bottom corner. Do to this I generally implement a recursive algorithm which repeat calling the same function by checking all neighbors around the current index. I generally end up with something like : void foo( int row, int col) { //Do something here like caching the index if ( twoDimensionData[row - 1][col] == '*') foo(row- 1, col); else if ( twoDimensionData[row + 1][col] == '*') foo(row+ 1, col); else if ( twoDimensionData[row - 1 ][col - 1] == '*') foo(row - 1, col - 1); //. I need 5 more of this bad boys I mean if checks } ... It is wrong to explore in only one direction, so I assume you do not mean "else". Is there any better way to achieve this foreach(i;row-1..row+2){ foreach(j;col-1..col+2){ if(i==row && j==col) continue; if(twoDimensionData[i][j] == '*') foo(row,col); } } with cool std functions like enumerate or iota without needing to write eight if checks? cartesianProduct(iota(row-1,row+2),iota(col-1,col+2)) .filter!(a=>(a[0]!=row||a[1]!=col)) .filter!(a=>twoDimensionData[a[0]][a[1]]=='*') .each!(a=>foo(a.expand)); (You can usually drop the first filter because "doing something" will usually involve checking if the node has been visited and returning or else marking the node as visited.)
Too slow readln
Hello, there! I have the next "grep" code: https://dpaste.dzfl.pl/7b7273f96ab2 And I have the directory to run it: $ time /home/undefer/MyFiles/Projects/TEST/D/grep "HELLO" . ./strace.log: [pid 18365] write(1, "HELLO\n", 6HELLO real1m17.096s user0m54.828s sys 0m13.340s The same result I get with ldc2.. The same with bash and grep: $ time for i in `find .`; do file -b "$i" | grep -q text && grep -a "HELLO" "$i"; done [pid 18365] write(1, "HELLO\n", 6HELLO real0m42.461s user0m23.244s sys 0m22.300s Only `file` for all files: $ time find . -exec file {} + >/dev/null real0m15.013s user0m14.556s sys 0m0.436s Only grep for all files: $ for i in `find .`; do file -b "$i" | grep -q text && echo "$i"; done > LIST1 $ time for i in `cat LIST1`; do grep -a "HELLO" "$i"; done [pid 18365] write(1, "HELLO\n", 6HELLO real0m4.431s user0m1.112s sys 0m3.148s So 15+4.4 much lesser than 42.46. Why? How "find" so fast can run "file" so many times? And why 42.461s much lesser 1m17.096s? The second version of grep: https://dpaste.dzfl.pl/9db5bc2f0a26 $ time /home/undefer/MyFiles/Projects/TEST/D/grep2 "HELLO" `cat LIST1` ./strace.log: [pid 18365] write(1, "HELLO\n", 6HELLO real0m1.871s user0m1.824s sys 0m0.048s $ time grep -a "HELLO" `cat LIST1` ./strace.log:[pid 18365] write(1, "HELLO\n", 6HELLO real0m0.075s user0m0.044s sys 0m0.028s The profiler says that readln eats CPU. So why 0m0.075s much lesser 0m1.871s? How to write in D grep not slower than GNU grep?
Re: Avoid if statements for checking neighboring indexes in a 2D array
On 16.07.2017 18:55, Timon Gehr wrote: On 16.07.2017 12:37, kerdemdemir wrote: My goal is to find connected components in a 2D array for example finding connected '*' chars below. x x x x x x x x x x x x x x * * x x x x * * x x x x x * * x * x x x x x There are two connected '*' group in this example. First group is composes of six '*' located closer to middle and the second group composes only one '*' char located in the left bottom corner. Do to this I generally implement a recursive algorithm which repeat calling the same function by checking all neighbors around the current index. I generally end up with something like : void foo( int row, int col) { //Do something here like caching the index if ( twoDimensionData[row - 1][col] == '*') foo(row- 1, col); else if ( twoDimensionData[row + 1][col] == '*') foo(row+ 1, col); else if ( twoDimensionData[row - 1 ][col - 1] == '*') foo(row - 1, col - 1); //. I need 5 more of this bad boys I mean if checks } ... It is wrong to explore in only one direction, so I assume you do not mean "else". Is there any better way to achieve this foreach(i;row-1..row+2){ foreach(j;col-1..col+2){ if(i==row && j==col) continue; if(twoDimensionData[i][j] == '*') foo(row,col); } } with cool std functions like enumerate or iota without needing to write eight if checks? cartesianProduct(iota(row-1,row+2),iota(col-1,col+2)) .filter!(a=>(a[0]!=row||a[1]!=col)) .filter!(a=>twoDimensionData[a[0]][a[1]]=='*') .each!(a=>foo(a.expand)); (You can usually drop the first filter because "doing something" will usually involve checking if the node has been visited and returning or else marking the node as visited.) Ivan's example in this style: import std.stdio, std.range, std.algorithm, std.array; char[][] arr; int componentSize(size_t row, size_t col){ if(row>=arr.length||col>=arr[row].length||arr[row][col]!='*') return 0; arr[row][col]='x'; return 1+cartesianProduct(iota(row-1,row+2),iota(col-1,col+2)) .map!(a=>componentSize(a.expand)).sum; } void main (){ arr=["xx*", "*xx", "xx**xxx", "xx**x**", "xxx"].map!dup.array; cartesianProduct(iota(arr.length),iota(arr[0].length)) .filter!(a=>arr[a[0]][a[1]]=='*') .each!(a=>writeln(componentSize(a.expand))); } (This works even if there are * at the border.)
Re: Too slow readln
On Sunday, 16 July 2017 at 17:03:27 UTC, unDEFER wrote: [snip] How to write in D grep not slower than GNU grep? GNU grep is pretty fast, it's tough to beat it reading one line at a time. That's because it can play a bit of a trick and do the initial match ignoring line boundaries and correct line boundaries later. There's a good discussion in this thread ("Why GNU grep is fast" by Mike Haertel): https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html --Jon
Re: Avoid if statements for checking neighboring indexes in a 2D array
On 16.07.2017 19:10, Timon Gehr wrote: ... (This works even if there are * at the border.) Well, not really. :) Version that actually works if there are * at the border: import std.stdio, std.range, std.algorithm, std.array; char[][] arr; int componentSize(int row,int col){ if(row>=arr.length||col>=arr[row].length||arr[row][col]!='*') return 0; arr[row][col]='x'; return 1+cartesianProduct(iota(row-1,row+2),iota(col-1,col+2)) .map!(a=>componentSize(a.expand)).sum; } void main (){ arr=["***", "*xx", "xx**xxx", "xxx*x**", "**x"].map!dup.array; cartesianProduct(iota(cast(int)arr.length),iota(cast(int)arr[0].length)) .filter!(a=>arr[a[0]][a[1]]=='*') .each!(a=>writeln(componentSize(a.expand))); }
Re: Too slow readln
On Sunday, 16 July 2017 at 17:37:34 UTC, Jon Degenhardt wrote: On Sunday, 16 July 2017 at 17:03:27 UTC, unDEFER wrote: [snip] How to write in D grep not slower than GNU grep? GNU grep is pretty fast, it's tough to beat it reading one line at a time. That's because it can play a bit of a trick and do the initial match ignoring line boundaries and correct line boundaries later. There's a good discussion in this thread ("Why GNU grep is fast" by Mike Haertel): https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html --Jon Thank you. I understand yet another trick: $ find . -exec file -bi {} + is the same $ file -bi `find .`
Re: Too slow readln
I understand the main problem. dirEntries by default follows symlinks. Without it my first grep works only 28.338s. That really cool!
Re: Add property setting to chain
Thanks for reply, Ali.
Re: Avoid if statements for checking neighboring indexes in a 2D array
Hi Guys, @Nicholas , thanks a lot for cool solution but actually I weren't working on image processing. I was trying to solve "http://codeforces.com/contest/828/problem/B";. I really needed finding connected components this time. @Ivan, your solution is much more elegant than what I did. But I find @Timon's solution with cartesian product a bit nicer in this case since I love to see std function more and more. Thanks guys for all your advises. D community is really the best. Here is my solution to question. It seems I didn't get it working completely yet. In my debugger(Msvc MonoD) even there are many rows it seems Recurse function only loops the columns in the first row. And debugger is jumping so strangely I couldn't tag the problem. But I don't expect a big change there should be a small bug that is it. Sorry if code contains some foreign words I just replaced many variable names from my native language I might be missing some. import std.stdio; import std.string; import std.algorithm; import std.conv; import std.array; import std.range; import std.math; int totalrow; int totalcolumn; dchar[][] twoDimensionArray; struct ConnectedElementsSolver { this( dchar[][] twoDimArray ) { m_twoDimStruct = twoDimArray; Recurse(0, 0); } void Recurse ( int row, int column ) { if( row < 0 || column < 0 ) return; for ( ; row < m_twoDimStruct.length ; row++ ) { for ( ; column < m_twoDimStruct[row].length; column++ ) { Process( row, column, m_twoDimStruct.length, m_twoDimStruct[row].length ); } } } void Process( int row, int column, ulong maxrow, ulong maxcolumn ) { if( row < 0 || column < 0 || row >= maxrow || column >= maxcolumn ) return; if ( m_twoDimStruct[row][column] == 'B' ) { m_twoDimStruct[row][column] = 'W'; m_tempResult.Process(row, column ); Process(row-1,column-1, maxrow, maxcolumn); Process(row,column-1, maxrow, maxcolumn); Process(row+1,column-1, maxrow, maxcolumn); Process(row-1,column, maxrow, maxcolumn); Process(row+1,column, maxrow, maxcolumn); Process(row-1,column+1, maxrow, maxcolumn); Process(row,column+1, maxrow, maxcolumn); Process(row-1,column+1, maxrow, maxcolumn); } else { if ( m_tempResult.HowManyFilled ) m_results ~= m_tempResult; m_tempResult.Resetle(); } } SquareCandidate m_tempResult; SquareCandidate[] m_results; dchar[][] m_twoDimStruct; } struct SquareCandidate { int MaxY; int MinY; int MaxX; int MinX; int HowManyFilled; this( int howManyFilled ) { HowManyFilled = howManyFilled; } void Resetle() { this = SquareCandidate(); } void Process( int row, int column ) { HowManyFilled++; MaxY = max( column, MaxY); MinY = min( column, MinY); MaxX = max( row, MaxX); MinX = min( row, MinX); } int FindEmptySlots() { int kareKenarUzunlugu = max(MaxX-MinX, MaxY-MinY); int kareAlani = kareKenarUzunlugu*kareKenarUzunlugu; return kareAlani - HowManyFilled; } bool CanCreateSquare( int xMax, int yMax ) { int xUzunlugu = MaxX-MinX; int yUzunlugu = MaxY-MinY; if ( xUzunlugu > yUzunlugu ) { return yMax >= xUzunlugu; } else { return xMax >= yUzunlugu; } } } void main() { auto dimensions = stdin.readln.strip.split().map!(a => to!int(a)).array(); totalrow = dimensions[0]; totalcolumn = dimensions[1]; twoDimensionArray = stdin .byLine() .take(totalrow) .map!(line => line .map!(a => to!dchar(a)) .array()) .array; ConnectedElementsSolver baglantiliElemCozucu = ConnectedElementsSolver(twoDimensionArray); bool isAnyProblemMakingSquare = baglantiliElemCozucu.m_results.any!(a => a.Ca
Re: Avoid if statements for checking neighboring indexes in a 2D array
Of course now I will try to have it work first. Than replace for loops with Cartesian product calls. Than I will make 2D array template and even maybe with random access range. And finally for being able to use this class later in the some other coding challenge I will make Searching( == 'W' part) and Process functions passed by parameter. Thanks a lot for help. Erdem
Re: stacktrace for InvalidMemoryOperationError
On Saturday, 15 July 2017 at 18:14:13 UTC, Joakim wrote: core.exception.InvalidMemoryOperationError@src/core/exception.d(696): Invalid memory operation ... See the wiki page about this: https://wiki.dlang.org/InvalidMemoryOperationError If you can't do all that, look for places you might be allocating in a destructor. The recent GC allocation flag -vgc might help: https://dlang.org/blog/2017/06/16/life-in-the-fast-lane/ Yes, I found it already and make documentation proposition about https://issues.dlang.org/show_bug.cgi?id=17642 Then it turned out that it was more difficult to avoid this problem than to find the cause. Yes, I have a heavy cleanup in the destructor and I can't definitely make this cleanup @nogc. So first I tried to make problematic object instances RefCounted. In my case, it means change ClientController[string] to RefCounted!ClientController[string] and create an object by lib interface, not by new. I have to change ClientController from object to struct because automem can't work with the object. I hunted all extra copies of these objects and kill them all, so now I sure there are no copies except the main container. I tried different libraries from std.typecons.RefCounted to automem. And I fail to make it _really_ reference counted. It's not deleted when I delete it from the container. Every time it's collected by GC and application is dropped. So I decided that in the container itself. Obviously, the associative array does not delete the elements themselves on demand, but loses them and gives them to the garbage collector. I didn't find any map or multimap @nogc container so I make the ugly solution: I move all destruction to method destructMe() and in ~this() now is just assert(destructMeIsCalled). And now I have to call manually destructMe() before object removing. It seems it works. I'm wondering if there is a less ugly way to have the map of reference counted objects.
Re: Too slow readln
On 07/16/2017 10:37 AM, Jon Degenhardt wrote: There's a good discussion in this thread ("Why GNU grep is fast" by Mike Haertel): https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html --Jon Another fast GNU utility was on Reddit a month ago: https://www.reddit.com/r/programming/comments/6gxf02/how_is_gnus_yes_so_fast_xpost_runix/ Ali
Re: How to initialize the Variant variable with the default value using TypeInfo?
On 07/15/2017 09:27 AM, RedCAT wrote: > I have only TypeInfo taken dynamically from an arbitrary type, and I > need to initialize the Variant variable to the default value for this > type. I'm afraid there is no way of setting the value of Variant dynamically because it wants to do type-checking at compile time. You should be able to use a switch-case statement yourself because presumably you have the set of valid types: // (Not compiled.) switch (ti) { case typeid(int): o = v.get!int; > TypeInfo ti = typeid(int); > Variant v = ti.initializer; TypeInfo.initializer returns an array of bytes (no type clue there): /** * Return default initializer. If the type should be initialized to all * zeros, an array with a null ptr and a length equal to the type size will * be returned. For static arrays, this returns the default initializer for * a single element of the array, use `tsize` to get the correct size. */ abstract const(void)[] initializer() nothrow pure const @safe @nogc; Ali