One other thing -- after sleeping on this, I realized I had two
conflicting views about negative numbers in the stock values you were
working with:

(*) One is that negative numbers may appear in the data.

(*) The other is that negative numbers do not appear in the data.

Obviously, these cannot both be true. But the implementation should be
changed to fit whichever is the correct statement:

If negative numbers appear in the data, the code should be tested
against some negative values and fixed so that it works correctly.
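
A quick way to see why this needs testing (a sketch only; the sample
string is made up, not taken from your data). J writes negative
numbers with _ rather than -, so the two forms of ". treat a hyphen
differently:

```j
NB. Hypothetical sample text, not from the real stock data.
sample=: '5 -3'

NB. Dyadic ". converts text to numbers and accepts - as a negative
NB. sign; fields it cannot parse become the left argument (__ here).
asNumbers=: __ ". sample

NB. Monadic ". executes the text as a J sentence, so this is 5 - 3.
asSentence=: ". sample
```

Here asNumbers should come out as 5 _3, while asSentence is 2.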

If negative numbers do not appear in the data, the whole elaborate
date cleaning process should be eliminated; instead, use y-.'-' to
remove hyphens from the input data before it's used.  This would also
mean that there's no point in boxing the numbers during parsing.
Instead, use (__&".;._2@,&',');._2 to convert the comma and linefeed
delimited text to a numeric array. (I hope this makes sense?) Or, if
you prefer, replace the commas with spaces and then use __&".;._2 to
convert the space and newline delimited text to a numeric array:
__&".;._2 ' ' (I. ','=txt)} txt. Or, if you don't mind a little
risk, use ".;._2 on the text once the hyphens have been removed (the
risk being that monadic ". executes each line as a J sentence).
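
To make the second case concrete, here is a small sketch. The sample
text and the names raw, txt, viaSpaces and viaExecute are made up for
illustration; I'm assuming the symbol column has already been dropped
and that the only hyphens are the ones inside dates:

```j
NB. Hypothetical sample: date, price, volume on LF-terminated lines.
raw=: '2020-04-01,10.5,11',LF,'2020-04-02,10.75,12',LF

NB. Remove every hyphen, so 2020-04-01 becomes 20200401.
txt=: raw -. '-'

NB. Overwrite the commas with spaces, then convert each line.
NB. Anything that fails to parse shows up as __ (negative infinity).
viaSpaces=: __&".;._2 ' ' (I. ','=txt)} txt

NB. The riskier variant: execute each line as a J sentence.
NB. In J, 20200401,10.5,11 is just three numbers joined by , (append).
viaExecute=: ".;._2 txt
```

Both results should be a table with one row per line of text and one
column per field.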

I hope this helps,

-- 
Raul

On Thu, Apr 2, 2020 at 4:31 AM HH PackRat <[email protected]> wrote:
>
> On 3/31/20, Raul Miller <[email protected]> wrote:
> > If you have enough memory for the intermediate results, you would have
> > no problems with a file that large. You need an order of magnitude
> > more memory for intermediate results than the raw data, though.
>
> I have a desktop PC with 12 GB memory and 2 TB + 1 TB hard drives
> (although, due to extensive downloading, I usually have less than 10
> GB of disk space available at any given time (I have to keep
> transferring downloaded files that I want to permanently save to
> DVD-ROM)).  [I also recently purchased a 12 TB backup hard drive to
> help alleviate this problem, but I have not yet installed it.]  I
> estimate that the maximum accumulated textual data for any single
> stock is approximately 895 KB, but most stocks seem to be less than
> this (often considerably less).  I think this setup meets your
> criteria above.
>
> > Me, if I was working with something that big, I'd probably break it
> > into pieces first, textually, before trying to process it numerically.
>
> That's exactly what I'd like the J program to do.  It doesn't need to
> do any numerical processing at all (except maybe counting where to
> start the next set of data to output).  It just needs to start at the
> next starting point in the file, keep accumulating rows of data until
> it reaches a *different* stock symbol, and then output the accumulated
> rows to a file with the name of the stock symbol.  Then repeat the
> same process for the next stock in the file.  I presume that I could
> always attempt to do this with a for loop, if I had to.  (It was
> merely a great convenience to get rid of the first column of the
> accumulated data before outputting the file.  It has to be done
> sometime, and it's easier to do it *before* outputting and saving.
> Otherwise, it would have to be done separately to 3,000 files!)
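
A minimal sketch of that splitting step, written "vertically". This
is an illustration under assumptions, not a finished program: the
name splitBySymbol is made up, and I'm assuming the text is comma
delimited, LF-terminated, and has the stock symbol in the first
column:

```j
NB. Split comma-delimited text into one piece per stock symbol.
NB. y: text of LF-terminated lines, symbol in the first column.
NB. Result: boxed table, one row per symbol: symbol ; remaining text.
splitBySymbol=: 3 : 0
  lines=. <;._2 y                      NB. box each line (LF dropped)
  syms=. (i.&',' {. ])&.> lines        NB. text before the first comma
  rest=. (>:@i.&',' }. ])&.> lines     NB. text after the first comma
  rest=. rest ,&.> LF                  NB. put the LF back on each line
  pieces=. syms <@;/. rest             NB. join the lines of each symbol
  (~. syms) ,. pieces                  NB. pair each symbol with its text
)
```

Each row of the result could then be saved with something like
text 1!:2 <filename, building the file name from the symbol. The key
adverb /. groups by symbol wherever the rows occur, so this does not
depend on the rows for one stock being adjacent.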
>
> > Like, maybe use the first column as a file name (discarding that
> > column from the intermediate files
>
> That's what I want the J program to do.  There are about 3,000 stocks,
> and I'm not about to manually put them into separate files, which the
> computer can do far faster than I.  As I've said, my goal is to
> transfer the separated 3,000 files to a DVD-ROM that I can, first of
> all, pop into my PC whenever I might want to do some analysis on one
> or more files (without constantly taking up needed disk space) and,
> secondly, that I can share with market friends who might be
> interested.
>
> I hope this clarifies my intents.  Thanks for your input and
> subsequent efforts, Raul!
>
> Harvey
>
>
> P.S.  I neglected to clarify my J background in my initial message(s).
> Although I began learning and writing J back in 2006, I still consider
> myself a "beginner+" or "level 1+" (or perhaps somewhat beyond that).
> I'm capable of using approximately 86 primitives (including counting
> some of the dyadic meanings) out of the approximately 120 (I think) J
> primitives.  *Personally*, I try to avoid terseness in favor of
> clarity where possible in my explicit programming.  (I know that
> terseness is a valued skill for probably all members of this group
> except me!)  For me, if I want to alter code or borrow code that I
> wrote, say, 3-5 or more years ago, I do not want to spend a lot of
> time trying to figure out not only what I did but, more importantly,
> *why* I did it.  I try to write as much heavily commented "vertical"
> (rather than "horizontal") code as possible.  (This also helps others
> who may not know J, with whom I may share some code, to have a better
> understanding of what my code does.)  Programming in J (which has
> become my primary language, although I'm still conversant in varieties
> of BASIC) is an enjoyable hobby for me.  [I'm 74 and have been retired
> from librarianship (and, much earlier, elementary education) for more
> than 10 years now (and retired 2 years ago from regular playing of
> church pipe organs after 50 years).]  I'm not writing production code
> or writing code under deadlines like (I assume) many others.  I write
> programs when I need to do something useful with various kinds of data
> I have (blood sugar charts for my doctor, massaging stock market data,
> doing Gann-related market calculations, etc.).  Lately, that's been
> maybe 1 or 2 *new* programs in a year (or, more often, altering or
> upgrading previous programs I've written).  On the other hand, I
> heavily use a couple of my older programs.  But that doesn't stop me
> from seeking to learn new things that I might be able to apply to my
> many interests.  Frankly, I just don't have time to read *all* the J
> messages in addition to all my other email messages.  But I definitely
> DO read messages that touch any of my areas of interest.  That's why,
> for example, I was VERY interested in the discussion and especially J
> coding concerning music recently.  And, of course, I read *every*
> message that was written in reply to my recent "how-to" requests.  My
> slowness in replying to a couple is because I have to "un-terse" them
> so that I understand what's happening and perhaps learn new ways of
> doing things in J.  An example was ";:", where I had never considered
> why or how to use it previously.  That was new learning for me.  Of
> course, there may still be more such new learning awaiting me as I
> unravel more code!
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm