(I forgot to CC the list in my first reply)

On Tue, Mar 25, 2014 at 07:12:16AM -0700, xeon Mailinglist wrote:
> For each file inside the directory $output, I cat the file and 
> generate a sha256 hash. This script takes 9 minutes to read 105 files, 
> 556MB of data in total, and generate the digests. Is there a way to make 
> this script faster? Maybe generate the digests in parallel?
> 
> for path in $output
> do
>     # sha256sum
>     digests[$count]=$( $HADOOP_HOME/bin/hdfs dfs -cat "$path" | sha256sum | awk '{ print $1 }')
>     (( count ++ ))
> done
> 
> 
> Thanks,

You were already told in #bash on Freenode that this is not a bash
issue, and yet you report it here as a bug.

Once bash has started the commands, it has no influence at all on their
performance.

Instead, ask the Hadoop people, and perhaps your operating system's
support channels, what you can do to optimize this. Maybe it cannot be
optimized at all; it depends on where the bottleneck is (disk, network,
etc.)
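
That said, if it turns out the hashing itself is CPU-bound, the
"generate digests in parallel" idea from the original message can be
tried by simply running several of those pipelines at once. The sketch
below is only an illustration, not a recommendation: it reuses $output,
$HADOOP_HOME and the hdfs/sha256sum pipeline from the quoted script,
while the helper name hash_one and the parallelism level (-P 4) are
arbitrary choices of mine. It also assumes $output expands to a
whitespace-separated list of HDFS paths, as the original loop does.

    #!/usr/bin/env bash
    # Sketch: hash the HDFS files in parallel instead of one at a time.
    # hash_one prints "path  digest" so results can be matched up later.
    hash_one() {
        local path=$1
        printf '%s  %s\n' "$path" \
            "$("$HADOOP_HOME"/bin/hdfs dfs -cat "$path" | sha256sum | awk '{ print $1 }')"
    }
    export -f hash_one
    export HADOOP_HOME

    # Run up to 4 pipelines at a time; adjust -P to the number of cores.
    printf '%s\n' $output | xargs -n 1 -P 4 bash -c 'hash_one "$0"'

Whether that actually helps still comes down to the same bottleneck
question: more parallel cat streams will not speed anything up if the
network or the HDFS datanodes are already saturated.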

-- 
Eduardo Alan Bustamante López
