Your code here contains a variety of unknowns for us. Names like 'vars', 'blockmat', 'extractcol', 'catenate', and 'rembcol' are all under-specified here.
That said, one approach you could use to start isolating this issue
would be to construct a unique list of numbers which are near 0.5 in
value. Save them in a global list somewhere. And, then instrument your
code with something which captures the first instance (block number,
line number or character number within that block, position in your
global list) of each of those values (using an exact match (something
with comparison tolerance 0 like =!.0 e.!.0) to detect them).

When working with large sets of data, a critical skill is being able
to extract representative small and inspectable sets of data when
isolating problems and/or testing and/or developing your code.

This has always been the case regardless of the language and coding
environment you're using. It's not easy.

Take care,

--
Raul

On Fri, Aug 20, 2021 at 6:14 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
>
> Unfortunately, the raw .txt-file is 6.6 GB. I read it using freadblock
> which gives me character arrays. The data is CSV-like (TAB-separated), so I
> extract fields (still in character form) and then assign each field (or
> column) to separate J-variables. The fields I know are numeric are
> converted using _1".
>
> The actual import contains this code:
>
> pos=. 0 [ i=. 0
> while. 0<$>0{block=. freadblock y;pos do.
> pos=.>1{block
> i=.>:i
> blockmat=.,;._2 >0{block
> for_ix. i.#vars do.
> vname=.x,>0{ix{vars
> select. >1{ix{vars
> case. 'N' do. ". vname,'=:',vname,',reducrank _1".blockmat extractcol (>2{ix{vars);',":>3{ix{vars
> case. 'C' do. ". vname,'=:',vname,' catenate rembcol blockmat extractcol (>2{ix{vars);',":>3{ix{vars
> end.
> end.
> end.
>
> extractcol=: 4 : 0 NB. textmat extractcol separator;position
> 'sep pos'=.y
> x=.}."1 ((+/\&.|:x e. sep)i."1 >:pos)|."0 1 x=.({.sep),.x
> (+/"1 *./\"1 -.x e. sep){."0 1 x
> )
>
> reducrank=: 3 : 0 NB. remove singleton dimensions
> (1-.~$y)$,y
> )
>
> Each freadblock give ~1e6 characters and after doing ,;._2 >0{block typical
> rows looks like this:
>
> TopOfBookDelta 20160627.070001.177063277 0 0 18634 0:f00100 18139 124.7 4771 125 18339 124.7 4771 125
> TopOfBookDelta 20160627.070001.177063278 0 0 18634 0:65fc000 18339 124.7 4771 125 18339 124.7 5584 125
> TopOfBookDelta 20160627.070001.177063279 0 0 18634 0:304e00 18339 124.7 5584 125 18346 124.7 5584 125
> TopOfBookDelta 20160627.070001.177063280 0 0 18634 0:25aae00 18346 124.7 5584 125 18546 124.7 5584 125
> TopOfBookDelta 20160627.070001.177063281 0 0 18634 0:a98800 18546 124.7 5584 125 18546 124.7 5984 125
> TopOfBookDelta 20160627.070001.177063282 0 0 18634 0:db1100 18546 124.7 5984 125 18546 124.7 6384 125
> TopOfBookDelta 20160627.070001.177063283 0 0 18634 0:e4b700 18546 124.7 6384 125 18546 124.7 7384 125
> TopOfBookDelta 20160627.070001.177063284 0 0 18634 0:e5e000 18546 124.7 7384 125 18546 124.7 7474 125
> TopOfBookDelta 20160627.070001.177063285 0 0 18634 0:126f300 18546 124.7 7474 125 18546 124.7 7574 125
> TopOfBookDelta 20160627.070001.177063286 0 0 18634 0:138c500 18546 124.7 7574 125 18546 124.7 7727 125
>
> (Yes, the time stamps use nanoseconds)
>
> This generates 16 J-arrays (some character, some numerical) all with the
> same # (in this case 55455236). I can see how troubleshooting this would be
> difficult without the original data file but sharing that isn't practical.
> But I think the transformation and assignment of the values to my
> J-variables is uncomplicated. As you can see, the assignment of the
> numerical fields uses only two verbs: reducrank and _1". I would really
> like to understand how they create these minute variations of '0.05'.
>
> Thanks,
> Pablo
>
>
> On Fri, Aug 20, 2021 at 10:54 AM 'Mike Day' via Programming <programm...@jsoftware.com> wrote:
>
> > But are they all evaluated in one go, or are you already buffering the
> > data, as suggested by several correspondents? If the latter is the case,
> > different buffers might give rise to “different” values...
> >
> > But we haven’t seen your data. Can you reproduce the apparent error on an
> > example that’s small enough to share with the forum?
> >
> > Cheers,
> >
> > Mike
> >
> > Sent from my iPad
> >
> > > On 20 Aug 2021, at 09:16, Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > >
> > > Just noticed I wrote *do* wrong. I meant ".
> > >
> > > Why would ". convert the same repeated character string into different
> > > (on a bit level) numbers?
> > >
> > > On Thu, Aug 19, 2021 at 7:33 PM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > >
> > >> OK, I buy that. However, the origin of the data makes this dubious. The
> > >> data is imported as characters, so what I have is a bunch (thousands) of
> > >> '0.05' converted in one go by ":
> > >> Why would ": convert the same repeated character string of '0.05' into
> > >> different numbers on a bit level?
> > >>
> > >> Kr,
> > >> Pablo
> > >>
> > >> On Thu, Aug 19, 2021 at 7:13 PM Brian Schott <schott.br...@gmail.com> wrote:
> > >>
> > >>> Try this?
> > >>>
> > >>>    cf =: 3 : '(~.y),.#/.~y '
> > >>>    imp =: -,+
> > >>>    0.05 imp 0.1*9!:18 ''
> > >>> 0.05 0.05
> > >>>    ~. 0.05 imp 0.1*9!:18 ''
> > >>> 0.05 0.05
> > >>>    cf 0.05 imp 0.1*9!:18 ''
> > >>> 0.05 1
> > >>> 0.05 1
> > >>>
> > >>> --
> > >>> (B=)
> > >>> ----------------------------------------------------------------------
> > >>> For information about J forums see http://www.jsoftware.com/forums.htm
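
A rough sketch of the instrumentation suggested at the top of this message, for anyone who wants a starting point. Everything named here (seen, firsts, record) is hypothetical and not part of the import code quoted above. The target 0.05 is taken from the thread, and the 1e_9 window is an arbitrary assumption. The verb keeps a global list of bit-distinct values close to the target and logs where each one first appeared, using tolerance-0 comparisons (i.!.0 and e.!.0) so that values which merely print alike are still told apart.

```j
NB. Hypothetical names and window size: assumptions for illustration only.
seen   =: 0$0         NB. global list of bit-distinct values found so far
firsts =: 0 3$0       NB. one row per value: block number, index in block, position in seen

record =: 4 : 0       NB. blocknumber record numericvector
idx   =. I. 1e_9 > | y - 0.05              NB. positions where y is within 1e_9 of 0.05
vals  =. idx { y
first =. (i. # vals) = vals i.!.0 vals     NB. first exact occurrence within this block
keep  =. first *. -. vals e.!.0 seen       NB. and not already recorded by an earlier block
new   =. keep # vals
firsts =: firsts , x ,. (keep # idx) ,. (# seen) + i. # new
seen   =: seen , new
i. 0 0
)

NB. Small synthetic check, reusing the offset from Brian's example:
d =: 0.05 + 0.1 * 9!:18 ''        NB. prints as 0.05 at default precision, but is a different float
0 record 0.05 , 0.3 , d , 0.05
1 record d , 0.05 , 0.05 - 0.1 * 9!:18 ''
firsts                            NB. rows: block, index in block, position in seen
```

In the import loop, the call would presumably go right after the _1". conversion of each numeric column, with i as the left argument. Once firsts shows which block and row first produced an odd value, that one row of blockmat is small enough to inspect by hand or to post here.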