I was going through the concurrent directory traversal program (similar to 
the Unix du command) provided as an example in the book "The Go Programming 
Language" (by Kernighan and Donovan). The example is along the lines of the 
following (slightly modified to clarify the question):


// semaphore to control the number of open file descriptors
var sem = make(chan struct{}, 20)

func dirents(dirName string) []os.FileInfo {
    defer func() {
        <-sem
    }()
    // Acquire a semaphore to read a Dir
    sem <- struct{}{}

    entrySlice, err := ioutil.ReadDir(dirName)
    
    if err != nil {
        return nil
    }
    return entrySlice
}


func walkDir(directoryName string, filesizeChan chan int64, wg *sync.WaitGroup) {
    defer wg.Done()

    for _, entry := range dirents(directoryName) {
        if entry.IsDir() {
            // if it is a directory, recurse
            subDirName := ...
            wg.Add(1)
            go walkDir(...)
        } else {
            // if it is a file, send down the size
            filesizeChan <- entry.Size()
        }
    }
}

The above example recurses through a directory tree using ioutil.ReadDir(): 
if an entry is a file, it sends its size; if it is a directory, it recurses 
again.
A semaphore is used to control the number of open file descriptors (it 
guards the ioutil.ReadDir() call). Let's assume we have a deeply nested 
structure of files and directories. There could potentially be 200 
goroutines active in the system, with only 20 proceeding and the remaining 
blocked. This is because the program in the example first spawns a 
goroutine and then potentially blocks on the semaphore.

Wouldn't it be better to first acquire the semaphore and then spawn the 
goroutine? This way we limit the number of goroutines that remain blocked 
at any given moment, i.e. the walkDir function would look like this:
 
func walkDir(directoryName string, filesizeChan chan int64, wg *sync.WaitGroup) {
    defer wg.Done()

    defer func() {
        <-sem
    }()
    // Acquire a semaphore before proceeding
    sem <- struct{}{}

    for _, entry := range dirents(directoryName) {
        if entry.IsDir() {
            // if it is a directory, recurse
            subDirName := ...
            wg.Add(1)
            go walkDir(...)
        } else {
            // if it is a file, send down the size
            filesizeChan <- entry.Size()
        }
    }
}


This approach would limit the number of goroutines active in the system. 
Wouldn't this be a better approach? 
Essentially, the question boils down to whether to spawn a goroutine and 
then block, or to block and then spawn a goroutine.

Thanks,
Nakul
