Re: [Unicon-group] Walk of file directory

Jafar Al-Gharaibeh Wed, 21 Jan 2015 13:59:45 -0800

Here is a slightly tweaked/reformatted version. By default, it now
auto-detects the number of available cores in the machine and launches
twice as many threads.
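
The default described above can be sketched as follows. This is an
illustration in Python rather than Unicon, and it is not the code in the
attached tdir.icn. The function name default_max_threads, the argument
layout (target directory first, thread cap second), and the fallback of 4
threads when the core count cannot be determined are all assumptions made
for the example.

```python
import os


def default_max_threads(args, fallback=4):
    """Pick the cap on concurrent threads.

    args is assumed to be [target_dir] or [target_dir, max_threads],
    mirroring the two command-line arguments described in this thread.
    """
    if len(args) > 1:
        # An explicit second argument overrides auto-detection.
        return int(args[1])
    cores = os.cpu_count()  # may be None if the count cannot be determined
    # Twice the detected cores, or the (assumed) fallback when detection fails.
    return 2 * cores if cores else fallback
```

With this shape, passing an explicit second argument always wins, and the
auto-detected value is used only when that argument is omitted.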


--Jafar

On Wed, Jan 21, 2015 at 12:17 PM, Jafar Al-Gharaibeh <[email protected]>
wrote:

> David,
>
>     I added a threaded solution @
> http://rosettacode.org/wiki/Walk_a_directory/Recursively#Icon_and_Unicon
>    Please review/edit as you see fit. (The source file is attached).
> Combining recursion with threads might not be the best solution for this
> problem. If I were to put this to real use I'd go with an iterative
> approach using a master/workers model. Anyway, this is an excellent
> demonstration of how to use threads! The key features are:
>
>    1- How to create threads, limit their number, and keep them self-load
> balanced (new threads are spawned at the time/place where needed. Once
> they are done, they vanish, allowing new threads to pop up in new places
> in the directory structure)
>    2- How to pass data to the threads and collect results from them using
> the new language features.
>
>
> Here is some sample output from my desktop machine (quad-core with
> mechanical HDD. I will try another machine with an SSD and see if more
> threads scale better).
>
> The first argument to the program is the target directory. The second is
> the maximum number of concurrent threads to use at any given moment. (This
> is a soft limit! My counters are "unmutexed", so the actual number might
> deviate.) Note that this is different from the actual number of threads
> used during the run, which is reported at the end. The program can
> create/destroy threads as needed, but cannot use more than "max" threads
> at any given moment, and again "max" is "soft". :)
>
> Cheers,
> Jafar
>
> c:\proj>tdir c:\ 1
> 39708 directories in 99867 ms using 1 threads
>
> c:\proj>tdir c:\ 4
> 39708 directories in 62222 ms using 4 threads
>
> c:\proj>tdir c:\ 4
> 39708 directories in 87650 ms using 4 threads
>
> c:\proj>tdir c:\ 1
> 39708 directories in 92525 ms using 1 threads
>
> c:\proj>tdir c:\ 4
> 39708 directories in 95655 ms using 4 threads
>
> c:\proj>tdir c:\ 16
> 39708 directories in 66138 ms using 21 threads
>
> c:\proj>tdir c:\ 8
> 39708 directories in 69307 ms using 8 threads
>
> c:\proj>tdir c:\ 4
> 39708 directories in 70539 ms using 4 threads
>
> c:\proj>tdir c:\ 16
> 39708 directories in 76392 ms using 32 threads
>
>
>
> On Sun, Jan 11, 2015 at 1:25 PM, David Gamey <[email protected]>
> wrote:
>
>> Sergey,
>>
>> I am responsible for much of the Rosetta code contributions (thanks also
>> to Steve, Andrew, Matt, Peter, and about 4 others) and this one in
>> particular dating from 2010. As I recall this was before the
>> multi-threading versions were widely available. I think multi-threading is
>> underrepresented in Rosetta/Unicon.
>>
>> If you come up with a multi-threading version, we should add it to the
>> post as an alternative version.  If you don't feel comfortable doing this,
>> post the code and I can add it.
>>
>> David
>>
>>   ------------------------------
>>  *From:* Sergey Logichev <[email protected]>
>> *To:* Jafar Al-Gharaibeh <[email protected]>
>> *Cc:* Unicon group <[email protected]>
>> *Sent:* Sunday, January 11, 2015 1:16 AM
>> *Subject:* Re: [Unicon-group] Walk of file directory
>>
>> Jafar,
>>
>> Thank you for a whole bundle of advice and suggestions! Threads are
>> worth trying. The thought of searching by file attributes is very useful
>> too. Your suggestion about slow I/O is partly right. For UNIX I tried the
>> program on a Raspberry Pi with a Class 6 microSD as the HDD (it's slow, I
>> agree). But for Windows it was quite a fast HDD. It would be interesting
>> to compare the performance of the program on Windows with the classic
>> approach based on the Win32 _FINDFIRST, _FINDNEXT functions. I have
>> threaded Delphi/Lazarus implementations of this algorithm. I feel that it
>> will be faster, but to what degree?
>>
>> Sergey
>>
>> 10.01.2015, 21:50, "Jafar Al-Gharaibeh" <[email protected]>:
>>
>>
>> Sergey,
>>
>>   There are so many things that came to mind when I saw your program.
>>
>> 1- At the end of your email, the sourceforge ad says "Go Parallel", which
>> is not a bad idea for this highly parallel application.
>>
>>  There is a similar program, "wordcount", listed in my dissertation
>> (available on unicon.org) that goes through directories and counts words in
>> every file using threads (Chapter 7, page 107).
>>
>> 2- Unicon open() already supports pattern matching that would greatly
>> (I believe) speed up your program. For example you can do this:
>>     L := open("*.icn")
>>
>>    to get a list of all the Unicon source files in the current directory.
>>
>>   Note: It would be nice if there were a way to tell open() to return
>> files based not only on a pattern, but also on file attributes, to allow
>> something like "get me all directories in the current directory", or "get
>> me all read-only files". There are a lot of situations where filtering
>> directory names, for example, is very useful, like this program.
>>
>> 3- The program on Rosetta Code is not optimized for speed. You can
>> minimize the number of lists created and put() by careful rewriting of the
>> code.
>>
>> 4- Depending on how deep the directory tree is, there might be a lot of
>> I/O going on. A slow disk might limit how fast you can go regardless of how
>> optimized your code is.
>>
>> I will share results if I get around to trying any of these options.
>>
>> Cheers,
>> Jafar
>>
>>
>>
>> On Sat, Jan 10, 2015 at 5:51 AM, Sergey Logichev <[email protected]>
>> wrote:
>>
>> Hello all!
>>
>> I am investigating the best approach to get a list of files in a
>> specified directory and beneath it in Unicon.
>> I found an excellent example at rosettacode.org:
>> http://rosettacode.org/wiki/Walk_a_directory/Recursively#Icon_and_Unicon
>>
>> I reworked it to implement matching of filenames against a specified
>> pattern (regular expression). My program recursively walks a directory and
>> prints the matching filenames, the same as dir (ls) does. Everything works
>> fine except performance. If the directory has a lot of subdirs, the search
>> may take 10-20 seconds before output starts. Could you give some advice on
>> how to enhance the performance?
>>
>> Some notes on how to build and use it. Unpack the contents of udir.zip to
>> a local directory. Define which environment you use in the env.icn file:
>> uncomment the line "$define _UNIX 1" in the case of UNIX. There is nothing
>> to do in the case of Windows.
>> Build the udir program:
>> unicon -c futils.icn
>> unicon -c options.icn
>> unicon -c regexp.icn
>> unicon udir.icn
>>
>> Usage: udir -f<filemask>
>> for example: udir -f*.icn
>> will list the icn files in the current dir and all its subdirectories.
>>
>> Best regards,
>> Sergey Logichev
>>
>>
>> ------------------------------------------------------------------------------
>> Dive into the World of Parallel Programming! The Go Parallel Website,
>> sponsored by Intel and developed in partnership with Slashdot Media, is
>> your
>> hub for all things parallel software development, from weekly thought
>> leadership blogs to news, videos, case studies, tutorials and more. Take a
>> look and join the conversation now. http://goparallel.sourceforge.net
>> _______________________________________________
>> Unicon-group mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/unicon-group
>>
>>
>>
>

tdir.icn
Description: Binary data
