Very cool. I likely won't get to this for a few days.
From: Jafar Al-Gharaibeh <[email protected]>
To: David Gamey <[email protected]>
Cc: Unicon group <[email protected]>
Sent: Wednesday, January 21, 2015 4:58 PM
Subject: Re: [Unicon-group] Walk of file directory
Here is a slightly tweaked/reformatted version. By default it now auto-detects
the number of available cores in the machine and launches twice as many threads.
--Jafar
On Wed, Jan 21, 2015 at 12:17 PM, Jafar Al-Gharaibeh <[email protected]> wrote:
David,
I added a threaded solution @
http://rosettacode.org/wiki/Walk_a_directory/Recursively#Icon_and_Unicon
Please review/edit as you see fit. (The source file is attached.) Combining
recursion with threads might not be the best solution for this problem. If I
were to put this to real use I'd go with an iterative approach using a
master/workers model. Anyway, this is an excellent demonstration of how to use
threads! The key features are:
1- How to create threads, limit their number, and keep them self-load-balanced
(new threads are spawned at the time/place where they are needed. Once they are
done, they vanish, allowing new threads to pop up in new places in the
directory structure).
2- How to pass data to the threads and collect results from them using the new
language features.
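As a rough illustration of the two features listed above, here is a minimal sketch in Python rather than Unicon. It is not the attached source: the names count_dirs, run, active and created are hypothetical, and the counter is deliberately left unlocked, so the cap is only approximate.

```python
import os
import threading

active = 1    # threads running now, main thread included; not locked, so the cap is soft
created = 1   # threads used over the whole run (also unlocked, so approximate)

def count_dirs(path, max_threads):
    """Return the number of directories at and below path."""
    global active, created
    total = 1
    pending = []                      # (thread, result list) for each spawned child
    try:
        subdirs = [e.path for e in os.scandir(path)
                   if e.is_dir(follow_symlinks=False)]
    except OSError:                   # unreadable directory: count it, skip its children
        return total
    for sub in subdirs:
        if active < max_threads:      # soft check: races may briefly overshoot
            active += 1
            created += 1
            result = []
            t = threading.Thread(target=run, args=(sub, max_threads, result))
            t.start()
            pending.append((t, result))
        else:                         # at the cap: walk this subtree in the current thread
            total += count_dirs(sub, max_threads)
    for t, result in pending:         # collect the counts handed back by the children
        t.join()
        total += result[0]
    return total

def run(path, max_threads, result):
    """Thread body: walk one subtree, hand the count back, then free the slot."""
    global active
    result.append(count_dirs(path, max_threads))
    active -= 1
```

Each spawned thread returns its count through its own list, so the directory total stays exact even though the thread counters are not.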
Here is some sample output from my desktop machine (quad-core with a mechanical
HDD; I will try another machine with an SSD and see whether more threads scale
better).
The first argument to the program is the target directory. The second is the
maximum number of concurrent threads to use at any given moment (a soft limit:
my counters are "unmutexed", so the actual number might deviate). Note that
this is different from the total number of threads used during the run, which
is reported at the end. The program can create/destroy threads as needed, but
cannot use more than "max" threads at any given moment, and again "max"
is "soft". :)
Cheers,
Jafar
c:\proj>tdir c:\ 1
39708 directories in 99867 ms using 1 threads
c:\proj>tdir c:\ 4
39708 directories in 62222 ms using 4 threads
c:\proj>tdir c:\ 4
39708 directories in 87650 ms using 4 threads
c:\proj>tdir c:\ 1
39708 directories in 92525 ms using 1 threads
c:\proj>tdir c:\ 4
39708 directories in 95655 ms using 4 threads
c:\proj>tdir c:\ 16
39708 directories in 66138 ms using 21 threads
c:\proj>tdir c:\ 8
39708 directories in 69307 ms using 8 threads
c:\proj>tdir c:\ 4
39708 directories in 70539 ms using 4 threads
c:\proj>tdir c:\ 16
39708 directories in 76392 ms using 32 threads
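The iterative master/workers alternative mentioned in this message could look roughly like the following. This is a Python sketch under stated assumptions, not Unicon and not code from the thread: count_dirs_pool, work and the default of 4 workers are hypothetical choices for illustration.

```python
import os
import queue
import threading

def count_dirs_pool(root, workers=4):
    """Count directories under root with a fixed pool of worker threads."""
    todo = queue.Queue()              # shared work list of directories to scan
    todo.put(root)
    counts = [0] * workers            # one slot per worker, so no lock is needed

    def work(slot):
        while True:
            path = todo.get()
            if path is None:          # sentinel: no more work for this worker
                todo.task_done()
                return
            counts[slot] += 1
            try:
                with os.scandir(path) as entries:
                    for e in entries:
                        if e.is_dir(follow_symlinks=False):
                            todo.put(e.path)   # queue the subdirectory for any idle worker
            except OSError:           # unreadable directory: already counted, skip children
                pass
            finally:
                todo.task_done()

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    todo.join()                       # returns once every queued directory is processed
    for _ in threads:
        todo.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return sum(counts)
```

Here the number of threads is fixed up front, so there is no counter to race on; the queue does the load balancing.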
On Sun, Jan 11, 2015 at 1:25 PM, David Gamey <[email protected]> wrote:
Sergey,
I am responsible for many of the Rosetta Code contributions (thanks also to
Steve, Andrew, Matt, Peter, and about 4 others) and this one in particular,
which dates from 2010. As I recall this was before the multi-threading versions
were widely available. I think multi-threading is underrepresented in
Rosetta/Unicon.
If you come up with a multi-threading version, we should add it to the post as
an alternative version. If you don't feel comfortable doing this, post the
code and I can add it.
David
From: Sergey Logichev <[email protected]>
To: Jafar Al-Gharaibeh <[email protected]>
Cc: Unicon group <[email protected]>
Sent: Sunday, January 11, 2015 1:16 AM
Subject: Re: [Unicon-group] Walk of file directory
Jafar,
Thank you for a whole bundle of advice and suggestions! Threads are worth
trying. The idea of searching by file attributes is very useful too. Your
suggestion about slow I/O is partly right. For UNIX I tried the program on a
Raspberry Pi with a Class 6 microSD card as the HDD (it's slow, I agree). But
on Windows it was quite a fast HDD. It would be interesting to compare the
performance of the program on Windows with the classic approach based on the
Win32 _FINDFIRST, _FINDNEXT functions. I have threaded Delphi/Lazarus
implementations of this algorithm. I feel they would be faster, but by how
much?
Sergey
10.01.2015, 21:50, "Jafar Al-Gharaibeh" <[email protected]>:
Sergey,
There are so many things that came to mind when I saw your program.
1- At the end of your email, the SourceForge ad says "Go Parallel", which is
not a bad idea for this highly parallel application. There is a similar
program, "wordcount", listed in my dissertation (available on unicon.org) that
goes through directories and counts the words in every file using threads
(Chapter 7, page 107).
2- Unicon's open() already supports pattern matching, which would (I believe)
greatly speed up your program. For example, you can do this:
L := open("*.icn")
to get a list of all of the Unicon source files in the current directory.
Note: It would be nice if there were a way to tell open() to return files not
only based on a pattern, but also on file attributes, to allow something like
"get me all directories in the current directory", or "get me all read-only
files". There are a lot of situations where filtering directory names, for
example, is very useful, like this program.
3- The program on Rosetta Code is not optimized for speed. You can minimize
the number of lists created and put() calls by carefully rewriting the code.
4- Depending on how deep the directory tree is, there might be a lot of I/O
going on. A slow disk might limit how fast you can go regardless of how
optimized your code is.
I will share results if I get around to trying any of these options.
Cheers,
Jafar
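To make the wished-for "pattern plus attribute" filtering in point 2 concrete, here is a small sketch in Python (not Unicon, and not a description of what open() does). The function list_matching and its only_dirs and only_readonly parameters are hypothetical names invented for this illustration.

```python
import fnmatch
import os

def list_matching(directory, pattern="*", only_dirs=False, only_readonly=False):
    """Return sorted entry names in directory that match pattern and the attribute filters."""
    names = []
    with os.scandir(directory) as entries:
        for e in entries:
            if not fnmatch.fnmatch(e.name, pattern):
                continue              # name does not match the wildcard pattern
            if only_dirs and not e.is_dir():
                continue              # caller asked for directories only
            if only_readonly and os.access(e.path, os.W_OK):
                continue              # caller asked for read-only entries only
            names.append(e.name)
    return sorted(names)
```

For example, list_matching(".", "*.icn") would return the Unicon source files in the current directory, and list_matching(".", only_dirs=True) only its subdirectories.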
On Sat, Jan 10, 2015 at 5:51 AM, Sergey Logichev <[email protected]> wrote:
Hello all!
I am now investigating the best approach to getting a list of the files in a
specified directory and beneath it in Unicon. I found an excellent example at
rosettacode.org:
http://rosettacode.org/wiki/Walk_a_directory/Recursively#Icon_and_Unicon
I reworked it to match filenames against a specified pattern (regular
expression). My program recursively walks a directory and prints the matching
filenames, the same as dir (ls) does. Everything works fine except
performance. If a directory has a lot of subdirectories, the search may take
10-20 seconds before output starts. Could you give some advice on how to
enhance the performance?
Some notes on how to build and use it. Unpack the content of udir.zip to your
local directory. Define which environment you use in the env.icn file:
uncomment the line "$define _UNIX 1" in the case of UNIX. Nothing needs to be
done in the case of Windows.
Make the udir program:
unicon -c futils.icn
unicon -c options.icn
unicon -c regexp.icn
unicon udir.icn
Usage: udir -f<filemask>
For example:
udir -f*.icn
shall list the icn files in the current dir and all its subdirectories.
Best regards,
Sergey Logichev
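For comparison, the behaviour of "udir -f<filemask>" described above can be sketched in Python (not Unicon) as a generator that yields each match as soon as it is found instead of building a complete list first. The name find_files is hypothetical, a shell-style wildcard mask stands in for the regular expression, and whether streaming results this way would remove the start-up delay in udir is untested.

```python
import fnmatch
import os

def find_files(root, mask):
    """Yield the path of every file under root whose name matches the wildcard mask."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, mask):
                yield os.path.join(dirpath, name)   # handed to the caller immediately

if __name__ == "__main__":
    # Rough equivalent of: udir -f*.icn
    for path in find_files(".", "*.icn"):
        print(path)
```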
------------------------------------------------------------------------------
Dive into the World of Parallel Programming! The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net
_______________________________________________
Unicon-group mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/unicon-group