I will strip your question from the noise.
A=:'abcdefgh',.'12345678'
B=:'bef',.'TBB'
A;B
┌──┬──┐
│a1│bT│
│b2│eB│
│c3│fB│
│d4│ │
│e5│ │
│f6│ │
│g7│ │
│h8│ │
└──┴──┘
[C=: A([,. ({:"1)@] #^:_1~ e.&:({."1))B
a1
b2T
c3
d4
e5B
f6B
g7
h8
e.&:({."1) NB. Indicates in which places first elements from B occur in A
#^:_1~ NB. Expand the elements from left according to the Booleans
from the right
[,. NB. Extend A with the result.
You will spend some time to rewrite A and B to items to which you can apply
this solution.
R.E. Boss
-----Original Message-----
From: Programming <[email protected]> On Behalf Of HH
PackRat
Sent: vrijdag 27 november 2020 07:59
To: Programming forum <[email protected]>
Subject: [Jprogramming] Merging (amending?) two different data files
Hello again! Another stock market related question:
I need to merge the data of two different data files on the basis of common
dates.
QUESTION 1 -- I know that I can use a "for." loop, but I was wondering if
there was perhaps a better (perhaps faster?), more J-like approach. The basic
idea is matching dates in a smaller file against a fuller set of dates in a
larger file and, if the dates match, to append/merge/amend the value in the
smaller file in with the data of the larger file for that date (see data
examples below).
QUESTION 2 -- I don't know how Excel handles J files, but is it better to add
a column of empty values to Data set A to amend into (with alphabetic letters)
or to have a column of spaces into which to amend alphabetic letters?
Here are sample data to illustrate what I'm trying to do:
Data set A consists of sequential calendar dates, daily lows, and daily highs.
Data set B consists of dates on which either a top (T) or bottom (B) occurred.
Data set C is the merged/amended result: dates, lows, highs, and T/B's (if any)
Data set A:
2015-01-14,153.74,156.49
2015-01-15,154.16,156.97
2015-01-16,153.82,157.63
2015-01-17,,
2015-01-18,,
2015-01-19,,
2015-01-20,154.03,157.33
2015-01-21,151.07,154.50
2015-01-22,151.76,155.72
2015-01-23,154.89,157.60
2015-01-24,,
2015-01-25,,
2015-01-26,155.77,159.46
2015-01-27,152.59,155.09
2015-01-28,151.55,154.53
2015-01-29,149.52,155.58
2015-01-30,153.04,155.24
Data set B: [merged/amended with A to form C] 2015-01-16,T 2015-01-21,B
2015-01-26,T 2015-01-29,B
Data set C: [final result]
2015-01-14,153.74,156.49,
2015-01-15,154.16,156.97,
2015-01-16,153.82,157.63,T
2015-01-17,,,
2015-01-18,,,
2015-01-19,,,
2015-01-20,154.03,157.33,
2015-01-21,151.07,154.50,B
2015-01-22,151.76,155.72,
2015-01-23,154.89,157.60,
2015-01-24,,,
2015-01-25,,,
2015-01-26,155.77,159.46,T
2015-01-27,152.59,155.09,
2015-01-28,151.55,154.53,
2015-01-29,149.52,155.58,B
2015-01-30,153.04,155.24,
The reason I mentioned a possible speed difference between a "for."
loop and something else is that it's quite possible for Data set B above to
have large distances between matches which could result in a lot of wasted time
checking for matches when there were none. (I chose this example because it
showed several matches within a small number of dates. Other data sets could
be significantly different.)
Even if a "for." loop is perhaps the best approach, I do still need to know an
answer to the second question--which is more typical for a blank, empty column:
empty values (such as a:) or single spaces? In this case, the other possible
values in this column would be scattered values of "T" (top) and "B" (bottom).
I appreciate any help with this!
Harvey
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm