Good call!

-----Original Message-----
From: Susan Lynch <sly...@fwdco.com>
To: U2 Users List <u2-users@listserver.u2ug.org>
Sent: Tue, Jul 23, 2013 9:54 am
Subject: Re: [U2] UniData Dynamic File Splitting Question


Cinda, wow!  When that 'file' is opened at the Unix level, I believe it
has to open 39 files, which is a lot of I/O compared with a single static
file.  It looks like the data would be about 3,578,880 bytes (44,736
records at roughly 80 bytes each) plus the empty space at the end of the
groups - nowhere near big enough to have to be dynamic.  Your minimum
number of records per group is zero, so you do have some empty groups,
and your maximum number of records per group is 27, which, at an average
record size of 44 bytes and a 1K block size (27 x 44 is about 1.2K,
already past the block), would definitely put you into level 1 overflow
on the groups with large numbers of records.

Since the file hashes unevenly, if you are keeping the current key
structure, I would increase the block size so that you can fit more
records per group, and my personal preference would be to make the file
static, with sizing something like 3733,2.  You could try creating a file
that size and copying the data into it - that should not take long - and
then you can see how it fits.  It should have ample room for growth, and
you would not have the overhead of splitting groups all the time, which
the file seems to be doing now.
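
A minimal sketch of that test at ECL, assuming UI.LOG.INFO.NEW is a
scratch name you pick - the CREATE.FILE and COPY syntax here is from
memory, so verify it against the UniData documentation for your release:

:CREATE.FILE UI.LOG.INFO.NEW 3733,2
:COPY FROM UI.LOG.INFO TO UI.LOG.INFO.NEW ALL
:GROUP.STAT UI.LOG.INFO.NEW

The first command creates a static hashed file with the 3733,2 sizing
suggested above (modulo 3733, block size multiplier 2); COPY fills it,
and GROUP.STAT lets you compare the resulting group loads against the
listing further down before you cut over.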

I did not see whether your file was KEYDATA or KEYONLY - the way it has
split, I am guessing KEYDATA, but I might be wrong about that.  You might
try changing your split type, if you are determined to keep the file
dynamic.
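
If you do stay dynamic, something along these lines should change the
split behavior in place - I believe CONFIGURE.FILE takes these keywords,
but I am going from memory, so check the command reference first:

:CONFIGURE.FILE UI.LOG.INFO KEYDATA
:CONFIGURE.FILE UI.LOG.INFO SPLIT.LOAD 80 MERGE.LOAD 40

The first line switches the split/merge type; the second raises the split
load (80 is just an illustrative value - yours is currently 60) so that
groups split less eagerly.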

Susan Lynch
F. W. Davison & Company, Inc.



-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Cinda Goff
Sent: Tuesday, July 23, 2013 12:05 PM
To: U2 Users List
Subject: Re: [U2] UniData Dynamic File Splitting Question

Sorry for the delayed response.  Posting was bad timing on my part because
I did not have access to the college files last week.  I have a copy of
the UI.LOG.INFO file that I last posted about.  Below are 1) the Unix-level
listing, 2) a partial GROUP.STAT, and 3) the guide -d3 output.

I have also been looking at hash type 1.  This file hashes about the same,
but I'm checking with the vendor to see whether I can convert a couple of
the college's files to hash type 1 to see if it prevents the splits.
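
For anyone curious, the test I have in mind is roughly this, with
UI.LOG.INFO.T1 as a throwaway copy - the DYNAMIC parameter names
(HASH.TYPE, MINIMUM.MODULUS, KEYONLY) are from memory, so please check
CREATE.FILE in the UniData docs before running it:

:CREATE.FILE UI.LOG.INFO.T1 DYNAMIC HASH.TYPE 1 MINIMUM.MODULUS 3288 KEYONLY
:COPY FROM UI.LOG.INFO TO UI.LOG.INFO.T1 ALL
:GROUP.STAT UI.LOG.INFO.T1

Comparing GROUP.STAT for the copy against the output below should show
whether type 1 hashing spreads these userid-plus-timestamp keys more
evenly and cuts down on the splitting.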

Thanks for any insight.
C.

Unix level of UI.LOG.INFO:

$ ls -al
total 10370
drwxrwx---   2 datatel  users       1024 Jul 11 07:52 .
drwxrwx--- 637 datatel  users      34304 Jul 23 10:52 ..
-rwxrwx---   1 datatel  users    1073152 Jul 22 15:41 dat001
-rwxrwx---   1 datatel  users     204800 Jul 22 15:15 dat002
-rwxrwx---   1 datatel  users      94208 Jul 22 14:23 dat003
-rwxrwx---   1 datatel  users     454656 Jul 22 15:51 dat004
-rwxrwx---   1 datatel  users      82944 Jul 22 14:51 dat005
-rwxrwx---   1 datatel  users      84992 Jul 18 17:36 dat006
-rwxrwx---   1 datatel  users     109568 Jul 22 09:27 dat007
-rwxrwx---   1 datatel  users      20480 Jul 22 09:59 dat008
-rwxrwx---   1 datatel  users      23552 Jul 16 16:24 dat009
-rwxrwx---   1 datatel  users      78848 Jul 22 11:55 dat010
-rwxrwx---   1 datatel  users     150528 Jul 22 14:50 dat011
-rwxrwx---   1 datatel  users       3072 Jun 17 17:28 dat012
-rwxrwx---   1 datatel  users      24576 Jul 18 16:28 dat013
-rwxrwx---   1 datatel  users     273408 Jul 22 15:01 dat014
-rwxrwx---   1 datatel  users      10240 Jul 22 14:04 dat015
-rwxrwx---   1 datatel  users      43008 Jul 22 14:32 dat016
-rwxrwx---   1 datatel  users     205824 Jul 22 13:34 dat017
-rwxrwx---   1 datatel  users      45056 Jul 22 13:06 dat018
-rwxrwx---   1 datatel  users     139264 Jul 22 15:23 dat019
-rwxrwx---   1 datatel  users     174080 Jul 22 15:35 dat020
-rwxrwx---   1 datatel  users      77824 Jul 22 09:05 dat021
-rwxrwx---   1 datatel  users      15360 Jul 22 15:23 dat022
-rwxrwx---   1 datatel  users    1735680 Jul 22 15:35 over001
-rwxrwx---   1 datatel  users       2048 Mar  7 21:17 over002
-rwxrwx---   1 datatel  users       2048 Nov 28  2012 over003
-rwxrwx---   1 datatel  users       2048 Jul  9 12:11 over004
-rwxrwx---   1 datatel  users       2048 Jun 26 07:42 over005
-rwxrwx---   1 datatel  users       2048 May  9 14:35 over006
-rwxrwx---   1 datatel  users       2048 May 14 16:57 over007
-rwxrwx---   1 datatel  users       2048 Jul  1 18:19 over008
-rwxrwx---   1 datatel  users       2048 Apr 30 10:21 over009
-rwxrwx---   1 datatel  users       2048 Apr 16 16:28 over010
-rwxrwx---   1 datatel  users       2048 Jul 22 13:19 over011
-rwxrwx---   1 datatel  users       2048 May 29 09:32 over012
-rwxrwx---   1 datatel  users       2048 Jun  6 12:06 over013
-rwxrwx---   1 datatel  users       2048 May 15 14:32 over014
-rwxrwx---   1 datatel  users       2048 Jul 11 21:17 over015
-rwxrwx---   1 datatel  users       2048 Jul  9 15:20 over016
-rwxrwx---   1 datatel  users       2048 Jul 17 09:14 over017
-----------------------------------------------------------------------

GROUP.STAT - I have the entire output but thought I would start with a
partial listing.  I did verify that the file looks pretty much the same
throughout and that there are no empty groups.

:GROUP.STAT UI.LOG.INFO
File = UI.LOG.INFO  modulo=3288  hash type=0  blocksize=1024  Split/Merge type = KEYONLY
Grp#  Bytes  Records
  0   314     5>>>>>
  1   433     7>>>>>>>
  2   244     4>>>>
  3   506     8>>>>>>>>
  4   381     6>>>>>>
  5   585     9>>>>>>>>>
  6   430     7>>>>>>>
  7   585     9>>>>>>>>>
  8   808    13>>>>>>>>>>>>>
  9   259     4>>>>
 10   506     8>>>>>>>>
 11   509     8>>>>>>>>
 12   256     4>>>>
 13   375     6>>>>>>
 14   198     3>>>
 15   631    10>>>>>>>>>>
 16   256     4>>>>
 17   250     4>>>>
 18   198     3>>>
 19   494     8>>>>>>>>
 20   710    11>>>>>>>>>>>
 21   247     4>>>>
 22   378     6>>>>>>
 23   308     5>>>>>
 24   238     4>>>>
 25   247     4>>>>
 26   442     7>>>>>>>
 27   512     8>>>>>>>>
 28   488     8>>>>>>>>
 29   317     5>>>>>
 30   183     3>>>
 31   643    10>>>>>>>>>>
...
1790   735    12>>>>>>>>>>>>
1791   887    14>>>>>>>>>>>>>>
1792  1091    17>>>>>>>>>>>>>>>>>
1793  1381    22>>>>>>>>>>>>>>>>>>>>>>
1794  1271    20>>>>>>>>>>>>>>>>>>>>
1795   887    14>>>>>>>>>>>>>>
1796   881    14>>>>>>>>>>>>>>
1797  1204    19>>>>>>>>>>>>>>>>>>>
1798   564     9>>>>>>>>>
1799  1052    17>>>>>>>>>>>>>>>>>
1800   366     6>>>>>>
1801  1332    21>>>>>>>>>>>>>>>>>>>>>
1802   817    13>>>>>>>>>>>>>
1803   881    14>>>>>>>>>>>>>>
1804  1198    19>>>>>>>>>>>>>>>>>>>
1805   756    12>>>>>>>>>>>>
1806   753    12>>>>>>>>>>>>
1807   930    15>>>>>>>>>>>>>>>
1808   619    10>>>>>>>>>>
1809  1186    19>>>>>>>>>>>>>>>>>>>
1810  1082    17>>>>>>>>>>>>>>>>>
1811   948    15>>>>>>>>>>>>>>>
1812   497     8>>>>>>>>
1813  1262    20>>>>>>>>>>>>>>>>>>>>
1814   945    15>>>>>>>>>>>>>>>
1815  1122    18>>>>>>>>>>>>>>>>>>
1816  1058    17>>>>>>>>>>>>>>>>>
1817  1506    24>>>>>>>>>>>>>>>>>>>>>>>>
1818   610    10>>>>>>>>>>
1819   997    16>>>>>>>>>>>>>>>>
1820   625    10>>>>>>>>>>
1821   762    12>>>>>>>>>>>>
1822   771    12>>>>>>>>>>>>
1823   707    11>>>>>>>>>>>
1824  1213    19>>>>>>>>>>>>>>>>>>>
1825   555     9>>>>>>>>>
...
3275   552     9>>>>>>>>>
3276   570     9>>>>>>>>>
3277   445     7>>>>>>>
3278   329     5>>>>>
3279   695    11>>>>>>>>>>>
3280   552     9>>>>>>>>>
3281   564     9>>>>>>>>>
3282   256     4>>>>
3283   384     6>>>>>>
3284   256     4>>>>
3285   436     7>>>>>>>
3286   439     7>>>>>>>
3287   515     8>>>>>>>>
======= =====
  2813712   44736   Totals
      128     2   Minimum in a group
     1737    27   Maximum in a group
    855.8   13.6   Averages per group
    247.06  3.92  Standard deviation from average
    0.29  0.29  Percent std dev from average
File has 17 over files, 22 prime files
-----------------------------------------------------------------------

guide output.  I did mask the userids in the key fields below.  Keys are
all a userid plus some unique timestamp information.
$ guide -d3 UI.LOG.INFO
$ more GUIDE*
::::::::::::::
GUIDE_ADVICE.LIS
::::::::::::::

data/UI.LOG.INFO
  Management advice:
         Running memresize may improve performance
    for access to the file.   This conclusion was reached
    for the following reasons:

       - File has 29 groups over split load.


Files processed:    1
Errors encountered: 0
::::::::::::::
GUIDE_ERRORS.LIS
::::::::::::::


Files processed:    1
Errors encountered: 0
::::::::::::::
GUIDE_STATS.LIS
::::::::::::::

data/UI.LOG.INFO
  Basic statistics:
    File type............................... Dynamic Hashing
    File size
      [dat001].............................. 1073152
      [dat002].............................. 204800
      [dat003].............................. 94208
      [dat004].............................. 454656
      [dat005].............................. 82944
      [dat006].............................. 84992
      [dat007].............................. 109568
      [dat008].............................. 20480
      [dat009].............................. 23552
      [dat010].............................. 78848
      [dat011].............................. 150528
      [dat012].............................. 3072
      [dat013].............................. 24576
      [dat014].............................. 273408
      [dat015].............................. 10240
      [dat016].............................. 43008
      [dat017].............................. 205824
      [dat018].............................. 45056
      [dat019].............................. 139264
      [dat020].............................. 174080
      [dat021].............................. 77824
      [dat022].............................. 15360
      [over001]............................. 1735680
      [over002]............................. 2048
      [over003]............................. 2048
      [over004]............................. 2048
      [over005]............................. 2048
      [over006]............................. 2048
      [over007]............................. 2048
      [over008]............................. 2048
      [over009]............................. 2048
      [over010]............................. 2048
      [over011]............................. 2048
      [over012]............................. 2048
      [over013]............................. 2048
      [over014]............................. 2048
      [over015]............................. 2048
      [over016]............................. 2048
      [over017]............................. 2048
    File modulo............................. 3288
    File minimum modulo..................... 101
    File split factor....................... 60
    File merge factor....................... 40
    File hash type.......................... 0
    File block size......................... 1024
    Free blocks in overflow file(s)......... 2
  Group count:
    Number of level 1 overflow groups....... 1708
    Primary groups in level 1 overflow...... 1693
    Primary groups over split factor........ 29
  Record count:
    Total number of records................. 44736
    Average number of records per group..... 13.61
    Standard deviation from average......... 3.92
    Minimum number of records per group..... 0
    Maximum number of records per group..... 27
    Median number of records per group...... 13.50
  Record length:
    Average record length................... 44.26
    Standard deviation from average......... 2.22
    Minimum record length................... 35
    Maximum record length................... 53
    Median record length.................... 44.00
  Key length:
    Average key length...................... 17.63
    Standard deviation from average......... 1.11
    Minimum key length...................... 13
    Maximum key length...................... 22
    Median key length....................... 17.50
  Data size:
    Average data size....................... 71.90
    Standard deviation from average......... 3.33
    Total data size......................... 3216336
    Minimum data size....................... 58
    Maximum data size....................... 85
    Median data size........................ 71.50
    Data in 1 - 512 bytes range............. 44736      (100.00%)
  Largest data size:
    85 bytes of data for this key........... "Jxxxxxxxxxxxx33205_10581"
    85 bytes of data for this key........... "Jxxxxxxxxxxxx44296_29096"
    85 bytes of data for this key........... "Jxxxxxxxxxxxx41414_12729"
  Smallest data size:
    58 bytes of data for this key........... "Mxxxxx62878_7"
    58 bytes of data for this key........... "Axxxx62898_90"
    58 bytes of data for this key........... "Dxxxx31883_54"
  Predicted optimal size:
    Records per block....................... 10
    Percentage of near term growth.......... 10
    Scalar applied to calculation........... 0.00
    Block size.............................. 1024
    Modulo.................................. 3288


Files processed:    1
Errors encountered: 0
--------------------------------------------------
Cinda Goff
N.C. Community College System
Database Administrator
919 807-7060
vRoom Link:
https://sas.elluminate.com/m.jnlp?password=M.BDCC127B096D131E11EAC16A0F9473&sid=2008362

E-mail correspondence to and from this address may be subject to the North
Carolina Public Records Law and shall be disclosed to third parties when
required by the statutes. (NCGS.Ch.132)


_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
