You should probably take a look at what Jd offers for ingesting delimited files like this into a database.
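
If a database is more than you need, the read-a-piece, write-a-piece pattern asked about below can also be sketched directly. This is a minimal illustration in Python rather than J (it is not Jd code). The function name `split_by_ticker` and its two path arguments are made up for the example, and it assumes the first line is the header and that the ticker is everything before the first comma, as in the sample data.

```python
import os


def split_by_ticker(master_path, out_dir):
    """Stream master_path one line at a time and write each ticker's rows,
    minus the ticker column, to out_dir/<ticker>.csv.

    Hypothetical helper for illustration only. Returns the sorted list of
    tickers that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    seen = set()      # tickers that already have an output file from this run
    current = None    # ticker whose output file is currently open
    out = None
    try:
        # newline="" keeps the original line endings untouched
        with open(master_path, "r", newline="") as src:
            next(src, None)  # assumption: row 0 is the header, so skip it
            for line in src:
                ticker, sep, rest = line.partition(",")
                if not sep:
                    continue  # blank or malformed line: nothing to split
                if ticker != current:
                    # Ticker changed: close the old file and open the next one.
                    if out is not None:
                        out.close()
                    # Append if this ticker showed up earlier in the file,
                    # otherwise start a fresh file.
                    mode = "a" if ticker in seen else "w"
                    path = os.path.join(out_dir, ticker + ".csv")
                    out = open(path, mode, newline="")
                    seen.add(ticker)
                    current = ticker
                out.write(rest)  # date onward, ticker column stripped
    finally:
        if out is not None:
            out.close()
    return sorted(seen)
```

Because it holds only one line in memory at a time, the same code should behave the same way on 25 lines or on 14,937,606. The output files carry no header row, matching the request to keep only the date onward.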

On Tue, Mar 31, 2020 at 12:58 AM HH PackRat <[email protected]> wrote:

> Finishing up with function #4......
>
> I have a very large file consisting of multiple sets of historical
> stock prices that I would like to split into individual files for each
> stock. (I'll probably first have to write out all the files to a USB
> flash drive [I have limited hard drive space, but it might work as a
> very tight fit] and then, when finished, burn them to a DVD-ROM for
> more permanent storage.) Since I thought that J was capable of
> handling very large files, I figured that this might be a challenge to
> try.
>
> Unfortunately, I don't know how to handle file reading where you might
> only be able to read a part of the file at a time. (I don't know how
> large a file J can read--maybe it can read the whole file.) This file
> has 14,937,606 lines and is 1.63 GB (1,759,801,721 bytes) in size.
>
> Additionally (and probably most importantly), I don't know how to
> collect a subset of the contents of a file to output to a file, and
> then resume where J left off and collect the next subset of data to
> output, and so on.
>
> I'm going to need a LOT of help with this J programming!
>
> Below is a sample of the data--5 days' worth of data for 5 different
> stocks. The master file is a csv file, and the individual outputs (5
> in this case) should also be csv files. (Obviously, row 0 needs to be
> ignored.) The output files should use the ticker symbol as the name
> for each file (e.g., AA.csv). The ticker symbol (column 0) should be
> stripped off of each line of data, with only the remainder of each row
> (date onward being retained) being cumulated for output.
>
> Please correct me if I'm wrong, but my assumption is that if code
> works for these 25 lines of data, the code ought to work as well for
> 14,937,606 lines!
>
> DATA SET D:
> __________________________________________________
>
> ticker,date,open,high,low,close,volume
> AA,2017-06-27,31.6,32.5,31.49,31.63,5463485.0
> AA,2017-06-28,32.1,33.0,31.93,32.95,3764296.0
> AA,2017-06-29,33.11,33.34,32.61,33.18,3730077.0
> AA,2017-06-30,33.16,33.45,32.535,32.65,3014777.0
> AA,2017-07-03,32.94,34.3,32.915,34.02,3112086.0
> AAPL,2017-06-28,144.49,146.11,143.1601,145.83,21915939.0
> AAPL,2017-06-29,144.71,145.13,142.28,143.68,31116980.0
> AAPL,2017-06-30,144.45,144.96,143.78,144.02,22328979.0
> AAPL,2017-07-03,144.88,145.3001,143.1,143.5,14276812.0
> AAPL,2017-07-05,143.69,144.79,142.7237,144.09,20758795.0
> GE,2017-06-28,27.26,27.4,27.05,27.08,30759065.0
> GE,2017-06-29,27.16,27.41,26.79,27.02,36443559.0
> GE,2017-06-30,27.09,27.19,26.91,27.01,25849199.0
> GE,2017-07-03,27.16,27.59,27.06,27.45,20664966.0
> GE,2017-07-05,27.54,27.56,27.23,27.35,21082332.0
> IBM,2017-06-28,155.15,155.55,154.78,155.32,2203062.0
> IBM,2017-06-29,155.35,155.74,153.62,154.13,3245649.0
> IBM,2017-06-30,154.28,154.5,153.14,153.83,3501395.0
> IBM,2017-07-03,153.58,156.025,153.52,155.58,2822499.0
> IBM,2017-07-05,155.77,155.89,153.63,153.67,3558639.0
> T,2017-06-28,37.88,38.065,37.78,37.94,20312146.0
> T,2017-06-29,37.87,37.98,37.62,37.62,23508452.0
> T,2017-06-30,37.73,37.87,37.54,37.73,22303282.0
> T,2017-07-03,37.84,38.13,37.785,38.11,11123146.0
> T,2017-07-05,38.11,38.21,37.85,38.12,19644726.0
> __________________________________________________
>
> SUPER thanks in advance for any and all help with this one!
>
> Harvey
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>

--
Devon McCormick, CFA
Quantitative Consultant
