Re: I/O Optimization

Don Poitras Wed, 05 Jun 2013 10:10:24 -0700

In article <a6cf87cbc0b60a459cb79af044a096db2227dca...@mailccr.us.syncsort.com> 
you wrote:
> The 'S' in FBS does not stand for SPANNED.  It stands for STANDARD meaning 
> all the blocks are a standard size.


> There are a few cautions to using FBS files.

> Do not MOD data to an FBS file.  A short block in an FBS file is an EOF 
> condition.  This means that if you luck out and fill the last buffer of a 
> file, you can MOD to it, but 99 44/100 percent of the time it doesn't work 
> and a program that goes to read the file doesn't see the added records in the 
> file.

You just have to make sure to rewrite the last block if it was short 
before appending.
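
To make the short-block rule concrete, here is a minimal sketch of the arithmetic only. It assumes fixed-length records and a BLKSIZE that is an exact multiple of LRECL. The function names are hypothetical, and the actual rewrite of the last block depends on the access method and is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Number of fixed-length records that fit in one full block. */
size_t records_per_block(size_t lrecl, size_t blksize)
{
    assert(lrecl > 0 && blksize >= lrecl && blksize % lrecl == 0);
    return blksize / lrecl;
}

/* Records held in a short last block; 0 means the last block is full
   (or the file is empty), so an append can start on a fresh block. */
size_t short_block_records(size_t total_records, size_t lrecl, size_t blksize)
{
    return total_records % records_per_block(lrecl, blksize);
}

/* 0-based index of the block where an append has to start writing.
   If the last block is short, this is that block: its existing records
   must be written again together with the new ones. */
size_t append_start_block(size_t total_records, size_t lrecl, size_t blksize)
{
    return total_records / records_per_block(lrecl, blksize);
}
```

With LRECL=80 and BLKSIZE=800, a file of 25 records has a short last block holding 5 records, so an append has to rewrite block 2 (0-based) rather than add a new block after it.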


> I can't recommend using FBS in the general case at all.  If you need random 
> access to data, you are probably better off using a VSAM KSDS file.

RRDS would be the closest match to this for VSAM. I guess it depends
on what you're used to. I'd much rather use FBS than deal with IDCAMS
and so forth. It's not trivial to do in assembler, but it's dog-easy
in C. 

> FBS can be useful in very specialized cases, but in general, avoid it.

It's the best native file fit for programmers wanting to use ftell(),
fseek() etc. These aren't "special" to Unix programmers. 
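
As an illustration of that style of access, here is a minimal sketch of reading a fixed-length record by number with standard C I/O. The function name read_record and the 0-based record numbering are assumptions made for the example. How the data set is opened on z/OS is left out.

```c
#include <assert.h>
#include <stdio.h>

/* Read record number recno (0-based) of length lrecl into buf.
   Returns 0 on success, -1 if the seek or the read fails. */
int read_record(FILE *fp, long recno, size_t lrecl, void *buf)
{
    assert(fp != NULL && buf != NULL && lrecl > 0 && recno >= 0);

    /* With fixed-length records the byte offset is a simple product. */
    if (fseek(fp, recno * (long)lrecl, SEEK_SET) != 0)
        return -1;

    /* Ask for exactly one whole record; a partial read counts as failure. */
    return fread(buf, lrecl, 1, fp) == 1 ? 0 : -1;
}
```

An index that stores record numbers, such as the B*-tree mentioned further down the thread, only needs this one call per lookup, so the cost of each fseek() dominates.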

> Chris Blaicher
> Principal Software Engineer, Software Development
> Syncsort Incorporated
> 50 Tice Boulevard, Woodcliff Lake, NJ 07677
> P: 201-930-8260  |  M: 512-627-3803
> E: [email protected]

> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:[email protected]] On 
> Behalf Of David Crayford
> Sent: Wednesday, June 05, 2013 8:19 AM
> To: [email protected]
> Subject: Re: I/O Optimization

> >> BTW, is the reason FBS is fast for seeks because it makes it easier
> >> to calculate the track position for a POINT? I've never seen FBS used 
> >> before.
> > I don't know what IBM uses under the covers, but it's probably the
> > same thing that SAS/C did. Calculate the CCHHR from the byte offset
> > and use EXCP to read the block directly. No need to use POINT. FBS is
> > guaranteed not to have any short blocks, so the calculation is trivial.
> >
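
The calculation described above can be sketched as follows. This is not the SAS/C or IBM code, only an illustration of why the arithmetic is trivial when every block is full. The struct and function names are hypothetical. blocks_per_track depends on the device and BLKSIZE and is passed in as a parameter. Mapping a relative track onto real extents is omitted, and the sketch assumes a single extent starting on a cylinder boundary.

```c
#include <assert.h>

/* Position of a byte within a data set whose blocks are all full. */
struct block_addr {
    unsigned long rel_track; /* track relative to data set start (0-based) */
    unsigned int  record;    /* R: block number on that track (1-based) */
    unsigned long offset;    /* byte offset inside the block */
};

/* Turn a byte offset into relative track, record number and block offset.
   Valid only when no block before the target is short. */
struct block_addr locate(unsigned long byte_offset,
                         unsigned long blksize,
                         unsigned long blocks_per_track)
{
    struct block_addr a;
    unsigned long block;

    assert(blksize > 0 && blocks_per_track > 0);
    block       = byte_offset / blksize;
    a.rel_track = block / blocks_per_track;
    a.record    = (unsigned int)(block % blocks_per_track) + 1;
    a.offset    = byte_offset % blksize;
    return a;
}

/* Split a relative track into cylinder (CC) and head (HH), assuming one
   extent that starts on a cylinder boundary. */
void track_to_cchh(unsigned long rel_track, unsigned long tracks_per_cyl,
                   unsigned long *cc, unsigned long *hh)
{
    assert(tracks_per_cyl > 0 && cc != 0 && hh != 0);
    *cc = rel_track / tracks_per_cyl;
    *hh = rel_track % tracks_per_cyl;
}
```

With a block size of 27920 and 2 blocks per track, byte offset 100000 falls in block 3, which is record 2 on relative track 1, at offset 16240 within the block. A single short block anywhere before the target would invalidate the division, which is why the format has to guarantee full blocks.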

> Where does "spanned" come into play? Why does that make the difference?

> >> On 5/06/2013 1:24 PM, Bernd Oppolzer wrote:
> >>> Yes, I guess, that the freads on ZFS will be faster, but:
> >>>
> >>> - the customer wants the files to be classical z/OS data sets, and
> >>>
> >>> - because of the cache (least recently used algorithm), the I/O
> >>> delays are of no real concern any more - the elapsed time is much
> >>> better than in the DB2 case, anyway.
> >>>
> >>> But: we observed that RECFM=FBS is needed; otherwise the fseek calls
> >>> are very slow (the tables have indexes which are B*-tree structures
> >>> and operate based on record numbers, so an efficient fseek call is
> >>> needed).
> >>>
> >>> Kind regards
> >>>
> >>> Bernd
> >>>
> >>>
> >>>
> >>> Am 05.06.2013 04:01, schrieb David Crayford:
> >>>> Did you try using a ZFS file system? On my system
> >>>> freads()/fwrites() to a unix file are significantly faster than 
> >>>> QSAM/BSAM.
> >>>>
> >>>> On 5/06/2013 9:33 AM, Bernd Oppolzer wrote:
> >>>>> Two weeks ago, I told you about tests with our table system, which
> >>>>> holds read-only data for our insurance math package.
> >>>>>
> >>>>> The data was stored in DB2 tables until now, and we tried to get
> >>>>> better CPU usage etc. by moving the data to our file based table
> >>>>> system, which we have in the Windows and Unix environments.
> >>>>>
> >>>>> First tests showed a reduction in CPU time, but an increased
> >>>>> elapsed time, due to I/O waits during fseek / fread calls. See
> >>>>> below.
> >>>>>
> >>>>> Now we examined this more deeply, and, as it turned out, the main
> >>>>> storage cache simply didn't work due to configuration errors. This
> >>>>> was the reason for the many file I/Os that occurred and for the
> >>>>> massive I/O waits.
> >>>>>
> >>>>> We fixed this, and we set the cache size to 9 MB (the table sizes
> >>>>> are some hundred MB). Furthermore, the file attributes have to be
> >>>>> RECFM = FBS, to support the fseek operations in an optimal way.
> >>>>>
> >>>>> Doing this, we got the following results:
> >>>>>
> >>>>> CPU time and elapsed time are reduced to about 50 % of the original
> >>>>> value; and that is not only the time for the table access, but for
> >>>>> the whole computation !!
> >>>>>
> >>>>> That means that the reduction in table access times must be
> >>>>> much higher still.
> >>>>>
> >>>>> This can be explained by the following things:
> >>>>>
> >>>>> a) because the table system and the cache is in the same address
> >>>>> space as the application, we have no address space switching
> >>>>> traffic, as we have with the DB2 solution
> >>>>>
> >>>>> b) DB2 is used by other requesters, too
> >>>>>
> >>>>> c) because the tables are read-only, the table system does nothing
> >>>>> with respect to transaction control etc, no locking and logging,
> >>>>> which makes it faster than DB2
> >>>>>
> >>>>> Of course, the drawback is:
> >>>>>
> >>>>> you have to spend 9 MB cache storage more in every region where
> >>>>> the application runs. But that's so little, that overall system
> >>>>> control has no problem with it.
> >>>>>
> >>>>> We do a mass test tomorrow to check out the behaviour of the
> >>>>> system, if there are many computations in a large number of
> >>>>> parallel jobs. If the results are the same, we will migrate the
> >>>>> system to the file based solution in the near future.
> >>>>>
> >>>>> Kind regards
> >>>>>
> >>>>> Bernd

-- 
Don Poitras - SAS Development  -  SAS Institute Inc. - SAS Campus Drive
[email protected]           (919) 531-5637                Cary, NC 27513

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
