This is my original code, which ran "forever". Were the amendments truly made in place? One array is sparse, the other preallocated. The comments record what I observed in Task Manager. I don't have much experience with Windows beyond the usual Office programs. (PS: I now realize I could have discarded the "T" and stored the remaining number in the sparse array.)
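
The idea in the PS can be sketched on a single toy row. This is only an illustration under assumptions: the row contents are made up, the names `vals` and `sp` are hypothetical and not part of the script below, and it assumes every non-empty field after the ID is a "T" followed by digits.

```j
NB. hypothetical fields of one row: ID, empty, 'T1', 'T24'
fields=: '4' ; '' ; 'T1' ; 'T24'
cols=: }. I. a: ~: fields        NB. columns holding data, ID excluded
vals=: ".@}.&> cols { fields     NB. drop the leading 'T', convert to numbers

NB. 2 by 4 sparse array, both axes sparse, sparse element _1
sp=: 1 $. 2 4 ; 0 1 ; _1
sp=: vals (< 0 ; cols)} sp       NB. store the numbers for row 0 directly
```

With this layout the separate boxed `data` array would not be needed, since the sparse array would hold the values themselves rather than positions into `data`.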

NB. According to Task Manager, 7 GB was free (of 16 GB)
NB. and the J process fluctuated steadily by 10 MB.

NB. Is this a mapped-file issue on Windows 10,
NB. or were the amendments not in place?

NB. file detail
NB. fields   into    rows * columns
NB. 67653078 *inv 1183748 *    2141
NB. 37.4618 cells per field, i.e. about 2.67 percent filled

NB. c:/Users/user/Downloads/j904_win64/j904/bin/jconsole.exe
NB. JVERSION
NB. Engine: j904/j64avx/windows
NB. Beta-e: commercial/2022-07-16T19:25:02
NB. Library: 9.04.03
NB. Platform: Win 64
NB. Installer: J904 install
NB. InstallPath: c:/users/user/downloads/j904_win64/j904
NB. Contact: www.jsoftware.com

require 'jmf'

testfile=:'c:/Users/user/temp/tc.csv'
datafile=:'c:/Users/user/ZW/kaggle.com/bosch-production-line-performance/train_categorical.csv'

NB. INF {~ 0 indexes rows
NB. gets data of first row
indexes=: (>:@{. + [: i.@<: -~/)@({ ~ 0 1&+)~

tokenize=: 3 :0  NB. y is the literal
 rows=. _1 , I. LF = y
 row_tally=. <: # rows
 row=. col=. 0
 k=. _1  NB. current data index
 col_tally=. >: +/ ',' = y {~ 0 indexes rows  NB. tally of columns
 data=: a: $~ col_tally + +/ 'T' = y  NB. columns + those with data, skipping ID
 NB. coor shall be sparse
 NB. coor=. ((<: # rows) , col_tally) $ _1
 coor=: 1 $. ((<: # rows) , col_tally) ; 0 1 ; _1 NB. coordinates of data
 while. row < 9 >. row_tally do.
  fields=. ([: <;._2 ,&',') y {~ row indexes rows
  cols=. }. I. a: ~: fields  NB. indexes of data in row excluding ID
  po=. (>: k) + i. # cols    NB. positions of these items in data
  co=. < row ; cols          NB. location in sparse array to store po
  da=. cols { fields


  NB. coor is sparse
  coor=: po co} coor   NB.NB.NB. assignments in place?
  NB. data is preallocated
  data=: da po} data   NB.NB.NB. assignments in place?


  k=. k + # cols
  row=. >: row
 end.
 'data and coor are global'
)


JCHAR map_jmf_'INF';testfile ] datafile

tokenize INF

unmap_jmf_'INF'
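
For reference, the behaviour of `indexes` and of the field cut inside `tokenize` can be checked on a small literal without mapping any file. This is a minimal sketch: the three-line `toy` string is made-up data, and the global names `toy`, `rows`, `fields` and `cols` are used here only for illustration.

```j
indexes=: (>:@{. + [: i.@<: -~/)@({ ~ 0 1&+)~

NB. hypothetical 3-line file: header, then two data rows
toy=: 'Id,a,b',LF,'4,,T1',LF,'6,T2,',LF

rows=: _1 , I. LF = toy            NB. _1 6 12 18
toy {~ 0 indexes rows              NB. Id,a,b   (header line, LF excluded)
toy {~ 1 indexes rows              NB. 4,,T1    (first data row)
>: +/ ',' = toy {~ 0 indexes rows  NB. 3        (tally of columns)

NB. cut the first data row into boxed fields, as tokenize does
fields=: ([: <;._2 ,&',') toy {~ 1 indexes rows
cols=: }. I. a: ~: fields          NB. one-item list 2 (ID column dropped)
```

Each call of `indexes` returns the character positions strictly between two consecutive line feeds, so every row is selected from the literal again on each pass through the loop.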
