I will strip your question from the noise.

   A=:'abcdefgh',.'12345678'
   B=:'bef',.'TBB'
   A;B
┌──┬──┐
│a1│bT│
│b2│eB│
│c3│fB│
│d4│  │
│e5│  │
│f6│  │
│g7│  │
│h8│  │
└──┴──┘

   [C=: A([,. ({:"1)@] #^:_1~ e.&:({."1))B
a1 
b2T
c3 
d4 
e5B
f6B
g7 
h8

e.&:({."1)      NB. Indicates in which places first elements from B occur in A
#^:_1~          NB. Expand the elements from left according to the Booleans 
from the right
[,.             NB. Extend A with the result.

You will spend some time to rewrite A and B to items to which you can apply 
this solution.


R.E. Boss



-----Original Message-----
From: Programming <programming-boun...@forums.jsoftware.com> On Behalf Of HH 
PackRat
Sent: vrijdag 27 november 2020 07:59
To: Programming forum <programm...@jsoftware.com>
Subject: [Jprogramming] Merging (amending?) two different data files

Hello again!  Another stock market related question:

I need to merge the data of two different data files on the basis of common 
dates.

QUESTION 1  --  I know that I can use a "for." loop, but I was wondering if 
there was perhaps a better (perhaps faster?), more J-like approach.  The basic 
idea is matching dates in a smaller file against a fuller set of dates in a 
larger file and, if the dates match, to append/merge/amend the value in the 
smaller file in with the data of the larger file for that date (see data 
examples below).

QUESTION 2  --  I don't know how Excel handles J files, but is it better to add 
a column of empty values to Data set A to amend into (with alphabetic letters) 
or to have a column of spaces into which to amend alphabetic letters?

Here are sample data to illustrate what I'm trying to do:

Data set A consists of sequential calendar dates, daily lows, and daily highs.
Data set B consists of dates on which either a top (T) or bottom (B) occurred.
Data set C is the merged/amended result: dates, lows, highs, and T/B's (if any)

Data set A:
2015-01-14,153.74,156.49
2015-01-15,154.16,156.97
2015-01-16,153.82,157.63
2015-01-17,,
2015-01-18,,
2015-01-19,,
2015-01-20,154.03,157.33
2015-01-21,151.07,154.50
2015-01-22,151.76,155.72
2015-01-23,154.89,157.60
2015-01-24,,
2015-01-25,,
2015-01-26,155.77,159.46
2015-01-27,152.59,155.09
2015-01-28,151.55,154.53
2015-01-29,149.52,155.58
2015-01-30,153.04,155.24

Data set B:  [merged/amended with A to form C] 2015-01-16,T 2015-01-21,B 
2015-01-26,T 2015-01-29,B

Data set C:  [final result]
2015-01-14,153.74,156.49,
2015-01-15,154.16,156.97,
2015-01-16,153.82,157.63,T
2015-01-17,,,
2015-01-18,,,
2015-01-19,,,
2015-01-20,154.03,157.33,
2015-01-21,151.07,154.50,B
2015-01-22,151.76,155.72,
2015-01-23,154.89,157.60,
2015-01-24,,,
2015-01-25,,,
2015-01-26,155.77,159.46,T
2015-01-27,152.59,155.09,
2015-01-28,151.55,154.53,
2015-01-29,149.52,155.58,B
2015-01-30,153.04,155.24,

The reason I mentioned a possible speed difference between a "for."
loop and something else is that it's quite possible for Data set B above to 
have large distances between matches which could result in a lot of wasted time 
checking for matches when there were none.  (I chose this example because it 
showed several matches within a small number of dates.  Other data sets could 
be significantly different.)

Even if a "for." loop is perhaps the best approach, I do still need to know an 
answer to the second question--which is more typical for a blank, empty column: 
empty values (such as a:) or single spaces?  In this case, the other possible 
values in this column would be scattered values of "T" (top) and "B" (bottom).

I appreciate any help with this!

Harvey
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to