Roger,

Yes, the docs for the direct interface are at:
http://www.omnibond.com/orangefs/docs/v_2_8_6/index.htm
or the 2.8.8 ones at:
http://www.omnibond.com/orangefs/docs/v_2_8_8/index.htm

2.8.8 (coming real soon) includes a few fixes; we don't know whether you will
run into those issues with 2.8.7, but just as a heads up, if your app uses
some of the rarer calls it might have problems.  If the app is using stdio
(fread, fwrite, etc.), then simply switching to the usrint (direct)
interface will raise the buffer size from 1K to 1M, and that alone might do a
lot.
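
If recompiling isn't an option, preloading is roughly this (the library names
and install path here are from memory, so check the Direct Interface page
above for the exact set your build installs):

  # rough sketch -- adjust the path and library names to your install
  export LD_LIBRARY_PATH=/opt/orangefs/lib:$LD_LIBRARY_PATH
  export LD_PRELOAD="libofs.so libpvfs2.so"
  ./users_app input.dat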

The cache will help if you are reading the same data over and over again.
 It currently uses 256K blocks, so it won't buffer as much as stdio, but
even that much will help a lot.  Repeat access will work MUCH better with
the cache.

The ucache has not made it into the docs yet; there is a README_UCACHE in the
src/client/usrint directory, along with a configuration helper script.

It would also help if we could see some strace output, to determine what
would make the most difference, or if you could point us to the app's source
(if available).
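
For example, something like this (with the job's real PID) captures the I/O
pattern to a file you could send along:

  # attach to the running process and log just the file I/O calls
  strace -f -p <PID> -e trace=read,write,open,close -o /tmp/app.strace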

thanks,
-Boyd, Walt and Jeff...


On Thu, Jan 2, 2014 at 6:07 PM, Moye,Roger V <[email protected]> wrote:

>  Boyd,
>
>
>
> These files are several GB in size and are read several bytes at a time, up
> to maybe 64KB per read.   How do I enable the cache?   I checked the PVFS
> website and I only see information on client-side attribute caching.
>
>
>
> As for preloading the direct interface, I am not sure what this is.  Are
> you referring to this:
>
>
> http://www.omnibond.com/orangefs/docs/v_2_8_5/Direct_Interface.htm#Program_Configuration
>
>
>
> If so, some of our apps are third party so we do not have the option to
> recompile them.  I would be interested in trying the Global Interface.
> However, am I correct that if we did not specifically configure PVFS to
> build this interface then it is not available to us unless we rebuild PVFS
> with the options enabled?
>
>
>
> If we use the Program Configuration option (for those codes that we can
> recompile), do you have any thoughts on how much of a performance boost we
> might see for small reads?
>
>
>
> Thanks a million!
>
>
>
> -Roger
>
>
>
> -----------------------------------------------------------
>
> Roger V. Moye
>
> Systems Analyst III
>
> XSEDE Campus Champion
>
> University of Texas - MD Anderson Cancer Center
>
> Division of Quantitative Sciences
>
> Pickens Academic Tower - FCT4.6109
>
> Houston, Texas
>
> (713) 792-2134
>
> -----------------------------------------------------------
>
>
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Boyd
> Wilson
> *Sent:* Thursday, January 02, 2014 11:20 AM
> *To:* Moye,Roger V
> *Cc:* [email protected]; [email protected]
>
> *Subject:* Re: [Pvfs2-users] how to troubleshoot performance problems
>
>
>
> Roger,
> are the files written once and read several times?   If so, trying to
> preload the direct interface for the app and enabling the cache might help,
> depending on how often the files are read.
>
> -boyd
>
>
>
> On Thu, Jan 2, 2014 at 11:50 AM, Moye,Roger V <[email protected]>
> wrote:
>
> Becky,
>
>
>
> I’ve been looking into this problem and your suggestion.   The users’ apps
> that are causing problems are reading anywhere from 64KB per read all the
> way down to 1 byte per read, and they spend much of their run time reading
> input.   Many are doing 4KB to 8KB per read.    I have read the PVFS FAQ
> about tuning individual directories.
>
>
>
> I confess I do not know what my options are regarding tuning of the
> directories to improve performance.  What is the purpose of the four
> different types of distributions?   What stripe size should I use?    I
> realize that PVFS is not suited for reads this small but *any*
> performance improvement that I can get for these particular problem jobs
> would be helpful.
>
>
>
> Thanks!
>
> -Roger
>
> -----------------------------------------------------------
>
> Roger V. Moye
>
> Systems Analyst III
>
> XSEDE Campus Champion
>
> University of Texas - MD Anderson Cancer Center
>
> Division of Quantitative Sciences
>
> Pickens Academic Tower - FCT4.6109
>
> Houston, Texas
>
> (713) 792-2134
>
> -----------------------------------------------------------
>
>
>
> *From:* Becky Ligon [mailto:[email protected]]
> *Sent:* Monday, December 16, 2013 4:10 PM
> *To:* Moye,Roger V
> *Cc:* Kyle Schochenmaier; [email protected]
>
>
> *Subject:* Re: [Pvfs2-users] how to troubleshoot performance problems
>
>
>
> Roger:
>
> In general, if your filesystem has x-number of servers and you have used
> the default 64K stripe size, then you would want to be reading or writing
> in *at least* (64K * number of servers) bytes at a time (but preferably
> more) in order to take advantage of the parallelism.  You also want to
> minimize the *number* of files that you create/delete in one job, since
> these operations require additional metadata accesses.
>
> These are guidelines, not rules.  Looking at what kinds of reads/writes and
> file accesses are being used is the best way to tune your filesystem for a
> particular purpose.  Keep in mind that directories and files can have
> different attributes than those specified in the config file as the
> defaults.  So, you can tune files or files in a directory to use a
> different number of servers, a different stripe size, etc.
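>
> For example, with your 4 servers and the default 64K stripe size, a single
> request needs to be at least 4 * 64K = 256K just to touch every server.  If
> I remember the attribute names correctly (double-check the FAQ for your
> version), per-directory defaults can be changed with pvfs2-xattr along these
> lines:
>
>   # hypothetical example: new files created under /pvfs2-mnt/bigdir would
>   # then use a 1MB strip size spread across all 4 servers
>   pvfs2-xattr -s -k user.pvfs2.dist_name -v simple_stripe /pvfs2-mnt/bigdir
>   pvfs2-xattr -s -k user.pvfs2.dist_params -v strip_size:1048576 /pvfs2-mnt/bigdir
>   pvfs2-xattr -s -k user.pvfs2.num_dfiles -v 4 /pvfs2-mnt/bigdir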
>
> Hope this little bit of information is helpful.
>
> Becky
>
>
>
> On Mon, Dec 16, 2013 at 3:33 PM, Moye,Roger V <[email protected]>
> wrote:
>
> Becky,
>
>
>
> You nailed it:
>
>
>
> read(7, "4", 1)                         = 1
>
> read(7, "|", 1)                         = 1
>
> read(7, "4", 1)                         = 1
>
> read(7, "2", 1)                         = 1
>
> read(7, "0", 1)                         = 1
>
> read(7, "6", 1)                         = 1
>
> read(7, "5", 1)                         = 1
>
> read(7, "1", 1)                         = 1
>
> read(7, "1", 1)                         = 1
>
> read(7, "2", 1)                         = 1
>
> read(7, "|", 1)                         = 1
>
> read(7, "4", 1)                         = 1
>
> read(7, "2", 1)                         = 1
>
> read(7, "0", 1)                         = 1
>
> read(7, "6", 1)                         = 1
>
> read(7, "5", 1)                         = 1
>
> read(7, "1", 1)                         = 1
>
> read(7, "3", 1)                         = 1
>
>
>
> He’s doing this from multiple processes on multiple nodes.
>
>
>
> Question to you:  Is there a rule of thumb to follow for ‘how small is too
> small’?
>
>
>
> -Roger
>
>
>
>
>
> -----------------------------------------------------------
>
> Roger V. Moye
>
> Systems Analyst III
>
> XSEDE Campus Champion
>
> University of Texas - MD Anderson Cancer Center
>
> Division of Quantitative Sciences
>
> Pickens Academic Tower - FCT4.6109
>
> Houston, Texas
>
> (713) 792-2134
>
> -----------------------------------------------------------
>
>
>
> *From:* Becky Ligon [mailto:[email protected]]
> *Sent:* Monday, December 16, 2013 1:25 PM
> *To:* Kyle Schochenmaier
> *Cc:* Moye,Roger V; [email protected]
> *Subject:* Re: [Pvfs2-users] how to troubleshoot performance problems
>
>
>
> Roger:
>
> I have also seen some codes that read/write one byte at a time, which is
> not appropriate for a parallel filesystem.  Try this:  While the user's
> process is running, attach to it with strace and see what kind of
> read/writes are being issued.
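>
> Something along these lines (substituting the real process ID) usually
> shows it right away:
>
>   # watch each read/write and its size as it happens
>   strace -p <PID> -e trace=read,write
>   # or let it run for a minute and hit Ctrl-C for a per-call summary
>   strace -p <PID> -c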
>
> Becky
>
>
>
> On Mon, Dec 16, 2013 at 1:57 PM, Becky Ligon <[email protected]> wrote:
>
> Roger:
>
> Are all of your filesystem servers ALSO metadata servers?
>
> Becky
>
>
>
> On Mon, Dec 16, 2013 at 1:18 PM, Kyle Schochenmaier <[email protected]>
> wrote:
>
> There are some tuning params that you can look into here.  By default there
> is round-robin loading across the servers, done in chunks of FlowBufferSize
> (iirc?).  You can set this in your config file, but by default the size is
> quite small (64k); I've pushed it up over 1-2MB and seen drastic
> improvements in bandwidth for larger requests.  If you're doing tons of
> small requests, though, this obviously won't help.
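>
> If memory serves, the option is FlowBufferSizeBytes, and it would look
> roughly like this in the server config (check the sample config for your
> version for the exact name and section):
>
>   # in each server's config file, likely inside the <FileSystem> block:
>   FlowBufferSizeBytes 1048576
>
> and I believe the pvfs2-server processes need a restart to pick it up.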
>
>
>
> Can you attach your config file so we can see how things are setup?
>
>
>
>
>
>
>  Kyle Schochenmaier
>
>
>
> On Mon, Dec 16, 2013 at 11:57 AM, Moye,Roger V <[email protected]>
> wrote:
>
>
>
> Over the past weekend one of my users reported that his compute jobs,
> running on a server with local disks, usually take about 5 hours.  However,
> running the same jobs on our small Linux cluster using a PVFS filesystem
> exceeded 24 hours.
>
>
>
> Here is the environment we are using:
>
> 1.        RHEL 6.4 on PVFS servers and clients.
>
> 2.       Computations are performed on any of 16 Linux clients, all
> running RHEL 6.4.
>
> 3.       We are running Orangefs-2.8.7.
>
> 4.       We have 4 PVFS servers, each with an XFS filesystem on a ~35TB
> RAID 6.  Total PVFS filesystem is 146TB.
>
> 5.       All components are connected via a 10GigE  network.
>
>
>
> I started looking for the source of the problem.   For the user(s) showing
> this poor performance, I found that pvfs2-client is using about 65% of the
> CPU while the compute jobs themselves are using only 4% each.   Thus the
> compute nodes are very lightly loaded and the compute jobs are hardly doing
> anything.   The pvfs2-server process on each PVFS server is using about
> 140% CPU.   No time is being spent in the wait state (so I assume the speed
> of the disks is not an issue).   While the system was exhibiting poor
> performance I tried to read/write some 10GB files myself and found the
> performance to be normal for this system (around 450MB/s).   I used ‘iperf’
> to measure the network bandwidth between the affected nodes and the PVFS
> servers and found it normal at 9.38Gb/s.  The directories that the users are
> reading/writing only have a few files in each.
>
>
>
> Iostat shows that the disk system is being constantly read by something as
> shown by ‘iostat -d 2’ on the PVFS servers:
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>
> sda               0.00         0.00         0.00          0          0
>
> sdb              19.00      4864.00         0.00       9728          0
>
> dm-0              0.00         0.00         0.00          0          0
>
> dm-1              0.00         0.00         0.00          0          0
>
>
>
> This is what iostat has looked like over the last 48 hours (since Saturday).
>
>
>
> I cannot find any documentation on how to get stats directly from pvfs2
> so I tried this command:
>
> pvfs2-statfs -m /pvfs2-mnt
>
>
>
> I received these results:
>
> I/O server statistics:
>
> ---------------------------------------
>
>
>
> server: tcp://dqspvfs01:3334
>
>         RAM bytes total  : 33619419136
>
>         RAM bytes free   : 284790784
>
>         uptime (seconds) : 14499577
>
>         load averages    : 0 0 0
>
>         handles available: 2305843009213589192
>
>         handles total    : 2305843009213693950
>
>         bytes available  : 31456490479616
>
>         bytes total      : 40000112558080
>
>         mode: serving both metadata and I/O data
>
>
>
> server: tcp://dqspvfs02:3334
>
>         RAM bytes total  : 33619419136
>
>         RAM bytes free   : 217452544
>
>         uptime (seconds) : 14499840
>
>         load averages    : 0 0 0
>
>         handles available: 2305843009213589104
>
>         handles total    : 2305843009213693950
>
>         bytes available  : 31456971476992
>
>         bytes total      : 40000112558080
>
>         mode: serving both metadata and I/O data
>
>
>
> server: tcp://dqspvfs03:3334
>
>         RAM bytes total  : 33619419136
>
>         RAM bytes free   : 428965888
>
>         uptime (seconds) : 5437269
>
>         load averages    : 320 192 0
>
>         handles available: 2305843009213588929
>
>         handles total    : 2305843009213693950
>
>         bytes available  : 31439132123136
>
>         bytes total      : 40000112558080
>
>         mode: serving both metadata and I/O data
>
>
>
> server: tcp://dqspvfs04:3334
>
>         RAM bytes total  : 33619419136
>
>         RAM bytes free   : 223281152
>
>         uptime (seconds) : 10089825
>
>         load averages    : 1664 3072 0
>
>         handles available: 2305843009213588989
>
>         handles total    : 2305843009213693950
>
>         bytes available  : 31452933193728
>
>         bytes total      : 40000112558080
>
>         mode: serving both metadata and I/O data
>
>
>
> Notice that the ‘load averages’ are 0 for servers #1 and #2 but not #3 and
> #4.   Earlier this morning only #4 showed a non-zero load average.  The
> other three were 0.  What does this number mean?
>
>
>
> My two theories about the source of the problem are:
>
> 1.        Someone is doing ‘a lot’ of tiny reads.
>
> 2.       Or, based on the load averages, the PVFS filesystem is somehow
> not balanced and all of the load is on a single server.
>
>
>
> How can I prove either of these?  Or what other types of diagnostics can I
> run?
>
>
>
> Thank you!
>
> -Roger
>
>
>
> -----------------------------------------------
>
> Roger V. Moye
>
> Systems Analyst III
>
> XSEDE Campus Champion
>
> University of Texas - MD Anderson Cancer Center
>
> Division of Quantitative Sciences
>
> Pickens Academic Tower - FCT4.6109
>
> Houston, Texas
>
> (713) 792-2134
>
> -----------------------------------------------------------
>
>
>
>
>
>
>
>
>
>
>
>
>   --
> Becky Ligon
> OrangeFS Support and Development
> Omnibond Systems
> Anderson, South Carolina
>
>
>
>
> --
> Becky Ligon
> OrangeFS Support and Development
> Omnibond Systems
> Anderson, South Carolina
>
>
>
>
> --
> Becky Ligon
> OrangeFS Support and Development
> Omnibond Systems
> Anderson, South Carolina
>
>
>
>
>
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
