Hope this is the right place to ask...
If not, feel free to tell me to 'go away' as long as you indicate a
better place.

anyway... I am experimenting with getpos/setpos on OpenVMS for large
files (> 2GB) and have some trouble using perl 5.8.6 on I64.

I realize I should try 5.8.10, and will install that asap.

My first issue, being an RMS hacker used the RFA's, is the getpos
essentially returns the position for the next record which might not
exist (yet). I guess that just takes getting used to.

The getpos _should_ return an opaque value (likely to be 2 int).
It doesn't.
It returns a 32 bit 'tell' like byte position taking the RFA VBN
minus 1 times 512 plus byte offset. What is worse, the same formula is
used for indexed files where that 'byte offset' (rab$w_rfa4) is really
a record ID which typically is less than 100 or so, but for certain
usages can be several thousands, up to 65K. So the 'offset' can add
into the vbn space.

btw...  gcc returns a 4 byte vbn,  2 byte ID, and 2 byte offset within
the record.

For SETPOS, only values up to 2GB seem to work!
Some code out there is using signed variables it seems!
Above 4GB all hope is lost.

Test below.
Comments?
Thanks!
Hein van den Heuvel

To test I created a 5GB file using 50,000,000 records of 99bytes + 1
newline. Do NOT use default IO. Takes too long!
Pre-alloated of course. And use sqo to prevent high-water marking
going crazy, 4 large buffers is enough (only ever saw 2 being written
from, while an other 1 w as being filled).

--------------------- create_big.pl ------------------
use VMS::Stdio (qw(&vmsopen));
$fh = vmsopen(qw(>big.tmp alq=10000000 fop=sqo ctx=rec mbc=112 mbf=4 rop=wbh));
printf $fh qq(%09d%90s\n),$_,q(*) foreach (1..50000000);

-------------------- test_big.pl -----------------
use VMS::Stdio (qw(&vmsopen));
use filehandle;
my $i = 0;
my $records = 0;
my $position;
my @positions;

my $fh = vmsopen(qw(<big.tmp fop=sqo ctx=rec mbf=8 rop=rah)) or die "$!";
  while (<$fh>) {
    # real work goes here...
    $records++; # just count for now.
    if (!($records % 1000000)) {
      $position = $fh->getpos;
      my ($a,$b) = unpack('LL',$position);
      printf "$records records at (%0X,%0X) %s\n", $a, $b, substr($_,0,20);
      $positions[++$i] = $position;
    }
}
print "$records records.\n";
undef $fh;
my $fh = vmsopen(qw(<big.tmp ctx=rec mbf=1 mbc=1)) or die "$!";
foreach (1 .. $i) {
  $fh->setpos($positions[$_]);
  $line = <$fh>;
  printf ("%4d %s\n", $_, substr ($line,0,20))
}


I also wrote a much similar program using C, give or take an off-by-one.

#include <stdio.h>
#include <stdlib.h>
#include <rms.h>

struct RAB *rab;

int acc_callback(int *first_arg, struct FAB *fab, struct RAB *the_rab) {
rab = the_rab; // stash it away
return 0;
}

int main(int argc, char *argv[]) {
FILE *fp;
int stat, i=0, j, records=0, *p;
char buffer[2048], ch;
fpos_t positions[100];

if ((fp = fopen(argv[1], "r", "acc", &acc_callback, 0,
  "fop=sqo", "ctx=rec", "mbf=8", "mbc=112", "rop=rah")) == NULL) {
  perror("open");
  exit(1);
  }

while ( fgets(buffer, sizeof buffer, fp)) {
  records++;
  if (!(records%1000000)) {
     if (fgetpos(fp, &positions[i]) != 0) {
        perror("fgetpos");
        exit(1);
        }
     buffer[20]=0;
     p = (void *) &positions[i];  // Helper for opaque item hex printout
     printf ("%5d, pos: %08x %08x  rfa: %08x,%04x %s\n",
        i, p[0], p[1], rab->rab$l_rfa0, rab->rab$w_rfa4,buffer);
     if (++i >= 100) break; // Emergency exit
     }
  }

fclose(fp);
if ((fp = fopen(argv[1], "r", "ctx=rec", "mbf=1", "mbc=1")) == NULL) {
  perror("open");
  exit(1);
  }
for (j = 0; j<i;  j++) {
     if (fsetpos(fp, &positions[j]) != 0) {
        perror("fsetpos");
        exit(1);
        }
     fgets(buffer, sizeof buffer, fp);
     buffer[20]=0;
     printf ("%5d, %s\n", j, buffer);
     }
}


Here are the test outputs, with the perl and C results merged, and a
little manual column aligning:

---------- perl result ----------           C-rtl getpos        RMS RFA
 1000000 records at ( 5F5E100,0) 001000000  0002faf0 00000100   0002faf1,0100
 2000000 records at ( BEBC200,0) 002000000  0005f5e1 00000000   0005f5e2,0000
 3000000 records at (11E1A300,0) 003000000  0008f0d1 00000100   0008f0d2,0100
:
21000000 records at (7D2B7500,0) 021000000  003e95ba 00000100   003e95bb,0100
22000000 records at (83215600,0) 022000000  004190ab 00000000   004190ac,0000
:
42000000 records at (FA56EA00,0) 042000000  007d2b75 00000000   007d2b76,0000
43000000 records at (  4CCB00,0) 043000000  00802665 00000100   00802666,0100
44000000 records at ( 642AC00,0) 044000000  00832156 00000000   00832157,0000
:
50000000 records.
--- Perl setpos ---                         C-rtl set pos
   1 001000001                              001000001
   2 002000001                              002000001
   3 003000001                              003000001
:
  21 021000001                              022000001
  22                                        023000001
:
  42                                        043000001
  43 50328                                  044000001
:
  50 50328

    0, pos: 0002faf0 00000100  rfa: 0002faf1,0100 001000000
    1, pos: 0005f5e1 00000000  rfa: 0005f5e2,0000 002000000
    2, pos: 0008f0d1 00000100  rfa: 0008f0d2,0100 003000000
:
   20, pos: 003e95ba 00000100  rfa: 003e95bb,0100 021000000
   21, pos: 004190ab 00000000  rfa: 004190ac,0000 022000000
:
   41, pos: 007d2b75 00000000  rfa: 007d2b76,0000 042000000
   42, pos: 00802665 00000100  rfa: 00802666,0100 043000000
   43, pos: 00832156 00000000  rfa: 00832157,0000 044000000
:
    0, 001000001
    1, 002000001
    2, 003000001
:
   21, 022000001
   22, 023000001
:
   42, 043000001
   43, 044000001

Reply via email to