I requested the data sets from Harvey. The sp1500 (full) dataset has 3
invisible characters in front of “Date” (bytes 239 187 191, which is the
UTF-8 byte order mark); a “printable” filter verb removes them.
Hopefully Harvey’s data will now work; I have not yet heard back. My findings
are below.
Henry and you were right: the shape of each cell in {.b would have shown 7
(not 4) in the first cell.
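A minimal sketch of why the match fails, using a hand-built stand-in for the first header cell rather than Harvey's actual file (hdr is a name made up for this illustration):

```j
NB. Hypothetical stand-in for the first header cell of the full file:
NB. three leading bytes (239 187 191) followed by 'Date'.
hdr =: (239 187 191 { a.) , 'Date'

# hdr                          NB. 7, not 4
'DATE' -: toupper hdr          NB. 0: a 7-item list cannot match a 4-item list
'DATE' -: toupper _4 {. hdr    NB. 1: only the last four characters are compared
```

The cell displays as Date either way, so the difference only shows up in the length or in the byte values.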
So the header record differs between the 2 datasets, as shown here: aa and a
hold the short data set, bb and b hold the full data set.
aa=. ',' readdsv (jpath '\Users\rob\jwork\test1.csv')
a=. removedoublequotes each aa
3{.a
┌──────┬─────┬──────┬──────┬──────┬──────┬────────┬──────┐
│Date │Price│Open │High │Low │Vol. │Change %│ │
├──────┼─────┼──────┼──────┼──────┼──────┼────────┼──────┤
│May 22│ 2020│670.21│668.40│670.41│665.21│- │0.23% │
├──────┼─────┼──────┼──────┼──────┼──────┼────────┼──────┤
│May 21│ 2020│668.66│672.93│675.06│666.13│- │-0.70%│
└──────┴─────┴──────┴──────┴──────┴──────┴────────┴──────┘
a.&i. each {.a
┌─────────────┬─────────────────┬──────────────┬──────────────┬──────────┬─────────────┬───────────────────────────┬┐
│68 97 116 101│80 114 105 99 101│79 112 101 110│72 105 103 104│76 111 119│86 111 108 46│67 104 97 110 103 101 32 37││
└─────────────┴─────────────────┴──────────────┴──────────────┴──────────┴─────────────┴───────────────────────────┴┘
NB. All looks OK so far … ‘Date’ is ASCII 68 97 116 101
a.{~ 68 97 116 101
Date
bb=. ',' readdsv (jpath '\Users\rob\jwork\sp1500.csv')
b=. removedoublequotes each bb
3{.b
┌──────┬─────┬──────┬──────┬──────┬──────┬────────┬──────┐
│Date │Price│Open │High │Low │Vol. │Change %│ │
├──────┼─────┼──────┼──────┼──────┼──────┼────────┼──────┤
│May 22│ 2020│670.21│668.40│670.41│665.21│- │0.23% │
├──────┼─────┼──────┼──────┼──────┼──────┼────────┼──────┤
│May 21│ 2020│668.66│672.93│675.06│666.13│- │-0.70%│
└──────┴─────┴──────┴──────┴──────┴──────┴────────┴──────┘
a.&i. each {.b
┌─────────────────────────┬─────────────────┬──────────────┬──────────────┬──────────┬─────────────┬───────────────────────────┬┐
│239 187 191 68 97 116 101│80 114 105 99 101│79 112 101 110│72 105 103 104│76 111 119│86 111 108 46│67 104 97 110 103 101 32 37││
└─────────────────────────┴─────────────────┴──────────────┴──────────────┴──────────┴─────────────┴───────────────────────────┴┘
The first cell contains 3 leading unprintable characters, which can be removed
with a ‘printable’ filter verb such as:
printable =: verb define
y #~ (32&<: *. 127&>) a. i. y NB. keep only codes 32 to 126 (127 is DEL)
)
>0{{.b
Date
a. i. >0{{.b
239 187 191 68 97 116 101
printable >0{{.b
Date
a. i. printable >0{{.b
68 97 116 101
This filters out the non-printable bytes correctly; printable has to be
applied before calling toupper:
if. 'DATE' -: toupper printable (> 0 { ({. a)) do. a=. }. a end.
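A narrower alternative, sketched on the assumption that the only stray bytes are the UTF-8 byte order mark at the very start of the file. The names BOM and stripbom are made up for this example and are not part of any J library:

```j
NB. Assumption: the three leading bytes are the UTF-8 byte order mark.
BOM =: 239 187 191 { a.

NB. stripbom (hypothetical name): drop the first 3 bytes only if they are the BOM.
stripbom =: 3 : 0
if. BOM -: 3 {. y do. 3 }. y else. y end.
)

stripbom BOM , 'Date'    NB. Date (4 characters)
stripbom 'Date'          NB. Date (unchanged)
```

Unlike printable, this leaves every other byte alone, which may matter if a file ever holds non-ASCII text; printable is the more forgiving choice if other control characters can turn up.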
Harvey should confirm, but I anticipate this is solved.
…/Rob
> On 26 May 2020, at 11:49 pm, Raul Miller wrote:
>
> You did not show the shape of the offending data label for the
> truncated 10 year case.
>
> Shapes have to match for -: to match.
>
> Shapes can differ if you have 1 dimension(s), or if you
> have invisible characters.
>
> Take care,
>
> --
> Raul
>
> On Tue, May 26, 2020 at 4:53 AM HH PackRat wrote:
>>
>> On 5/25/20, Henry Rich wrote:
>>> You used {: in the last line. Try it with {. .
>>>
>>> and on 5/25/20, 'robert therriault' via Programming wrote:
>>> I noticed in the single line test that you used {: a and not {. a
>>>
>>> and on 5/25/20, bill lam wrote:
>>> ... Maybe there is a typo such as {: instead of {. inside your code.
>>
>> Thanks for your eagle eyes in catching that typo! I had an older
>> remarked (NB.) line immediately above this line of code that had the
>> {:a which I visually copied instead of the correct {.a that I had used
>> everywhere else.
>>
>> Unfortunately, making this change did NOT change the results for the
>> full 10-year data. (It did work for the 10-day test case.) I have
>> no idea why this difference should exist. (The test case is the
>> column header row plus the first 10 days of the full 10-year file.) I
>> scanned the 10-year data, but nothing stood out as an anomaly. The
>> ONLY difference I noticed in the running of the J program is that the
>> initial boxing looks slightly different for the two sets of data.
>> (The data is read into file aa by the dsv routine in J.) I have no
>> idea why the two sets of data should look slightly different since the
>> data is the same, except for quantity. The difference is in the
>> display of the headers in the full data. (I tried my best to make
>> these look like the originals. It's very hard with a proportional
>> font.)
>>
>> Here is what a truncated output looks like for the 10-day test data:
>>
>> 3 {. aa
>> ┌─────┬─────┬─────┬─────┬─────┬──────┬───────┬──────┐
>> │ "Date" │ "Price"│"Open" │ "High" │"Low" │ "Vol." │"Change %"│ │
>> ├─────┼─────┼─────┼─────┼─────┼──────┼───────┼──────┤
>> │"May 22│ 2020" │"670.21"│"668.40"│"670.41"│"665.21"│"-" │"0.23%" │
>> ├─────┼─────┼─────┼─────┼──────┼─────┼───────┼──────┤
>> │"May 21│ 2020" │"668.66"│"672.93"│"675.06"│"666.13"│"-" │"-0.70%"│
>> └─────┴─────┴─────┴─────┴──────┴─────┴───────┴──────┘
>> 3 {. a
>> ┌─────┬───┬─────┬────┬────┬────┬──────┬────┐
>> │Date │Price│Open │High │Low │Vol. │Change %│ │
>> ├─────┼───┼─────┼────┼────┼────┼──────┼────┤
>> │May 22 │ 2020│670.21│668.40│670.41│665.21│- │0.23% │
>> ├─────┼───┼─────┼────┼────┼────┼──────┼────┤
>> │May 21 │ 2020│668.66│672.93│675.06│666.13│- │-0.70%│
>> └─────┴───┴─────┴────┴────┴────┴──────┴────┘
>> DATE
>> 1 <-------- match is TRUE and column header row is deleted below
>> 3 {. a
>> ┌─────┬───┬────┬────┬─────┬────┬┬─────┐
>> │May 22│ 2020│670.21│668.40│670.41│665.21│-│0.23% │
>> ├─────┼───┼────┼────┼─────┼────┼┼─────┤
>> │May 21│ 2020│668.66│672.93│675.06│666.13│-│-0.70%│
>> ├─────┼───┼────┼────┼─────┼────┼┼─────┤
>> │May 20│ 2020│673.36│670.68│675.42│670.26│-│1.73% │
>> └─────┴───┴────┴────┴─────┴────┴┴─────┘
>>
>>
>> And here is what a truncated output looks like for the full 10-year data:
>>
>> 3 {. aa
>> ┌──────┬────┬──────┬─────┬─────┬──────┬────────┬──────┐
>> │"Date"│"Price"│"Open" │"High" │"Low" │"Vol." │"Change %"│ │
>> ├──────┼────┼──────┼─────┼─────┼──────┼────────┼──────┤
>> │"May 22 │ 2020" │"670.21"│"668.40"│"670.41"│"665.21"│"-" │"0.23%" │
>> ├──────┼────┼──────┼─────┼─────┼──────┼────────┼──────┤
>> │"May 21 │ 2020" │"668.66"│"672.93"│"675.06"│"666.13"│"-" │"-0.70%"│
>> └──────┴────┴──────┴─────┴─────┴──────┴────────┴──────┘
>> 3 {. a
>> ┌─────┬───┬-─-─-─┬─────┬────┬────┬──────┬────┐
>> │Date│Price│Open │High │Low │Vol. │Change %│ │
>> ├─────┼───┼-─-─-─┼─────┼────┼────┼──────┼────┤
>> │May 22 │ 2020│670.21│668.40│670.41│665.21│- │0.23% │
>> ├─────┼───┼-─-──-┼─────┼────┼────┼──────┼────┤
>> │May 21 │ 2020│668.66│672.93│675.06│666.13│- │-0.70%│
>> └─────┴───┴─-─-─-┴─────┴────┴────┴──────┴────┘
>> DATE
>> 0 <-------- match is FALSE and column header row is not deleted below
>> 3 {. a
>> ┌─────┬───┬-─-─-─┬─────┬────┬────┬──────┬────┐
>> │Date│Price│Open │High │Low │Vol. │Change %│ │
>> ├─────┼───┼-─-─-─┼─────┼────┼────┼──────┼────┤
>> │May 22 │ 2020│670.21│668.40│670.41│665.21│- │0.23% │
>> ├─────┼───┼-─-──-┼─────┼────┼────┼──────┼────┤
>> │May 21 │ 2020│668.66│672.93│675.06│666.13│- │-0.70%│
>> └─────┴───┴─-─-─-┴─────┴────┴────┴──────┴────┘
>>
>> I'm curious why the headers are completely in sync with the data box
>> shapes when using the test data but out of sync with the box shapes
>> when using the full data (as well as having a match that is false
>> rather than true).
>>
>> Here is the very brief code that I've been using for testing:
>>
>> NB. ci2.ijs
>> require 'files stdlib'
>> require 'tables\dsv'
>> root=: '!user\......' NB. wherever you wish
>>
>> NB. syntax: ci2 'datafilename'
>> ci2=: 3 : 0
>> aa=. ',' readdsv (jpath root,y)
>> smoutput 3 {. aa
>> a=. removedoublequotes each aa
>> smoutput 3 {. a
>> smoutput toupper (> 0 { ({. a))
>> smoutput 'DATE' -: toupper (> 0 { ({. a))
>> if. 'DATE' -: toupper (> 0 { ({. a)) do. a=. }. a end.
>> smoutput 3 {. a
>> )
>> NB. ========================
>> NB. adapted from source: "Special Matrices & Lists" in "Phrases"
>> NB. [original: removeblanks]
>> removedoublequotes=: -.@('"'&E.) # ]
>> NB. ========================
>>
>> I would be happy to share the test file and full file via email with
>> anyone who is interested in figuring out this puzzle. The test file
>> is 699 B, and the full file is 159 KB.
>>
>> I'm eager to find out why the test file works but the full file
>> doesn't (even though the test file is the first 11 rows of the full
>> file).
>>
>> Harvey
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm