Dave,
It was interesting to see your post just after I had installed Octave
(open-source Matlab). The suggestion to re-do the work to be more J-like
is reasonable but perhaps the transition could be eased by retaining some
of the data structure you found useful.
What if you kept a structure of names that parallels the nested structure
of your data and used something like "(datnms;<data) lookup
'q1.test1.raw1'" to pull out an item in J? Would that be too
cumbersome? It seems like it's not conceptually too difficult.
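To pin down what I mean by that lookup, here is a minimal sketch of the logic. It is written in Python rather than J purely so the steps are easy to check, and the name "lookup", the (label, children) layout and the placeholder data values are all my own assumptions for illustration, not an existing J or Octave facility.

```python
# Hypothetical sketch: a tree of names kept parallel to a tree of data.
# Each level of `names` is a list of (label, children) pairs; the
# bottom level is a list of plain labels. `data` is nested the same way.

def lookup(names, data, dotted):
    """Walk names and data in parallel, one path component per level."""
    for part in dotted.split("."):
        labels = [n[0] if isinstance(n, tuple) else n for n in names]
        i = labels.index(part)          # raises ValueError for an unknown name
        child = names[i]
        names = child[1] if isinstance(child, tuple) else []
        data = data[i]
    return data

leaf_names = ["calconst", "raw1", "raw2"]
names = [("q1", [("test1", leaf_names)]),
         ("q2", [("test1", leaf_names)])]

# Placeholder values standing in for the real matrices.
data = [[["q1 calconst", "q1 raw1", "q1 raw2"]],
        [["q2 calconst", "q2 raw1", "q2 raw2"]]]

print(lookup(names, data, "q1.test1.raw1"))
print(lookup(names, data, "q2.test1"))
```

Stopping part-way down the name, as in the second call, returns the whole subtree, which is the "pass a pertinent subset to a function" case.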
I've just been looking at the Octave "save" format - the non-binary one -
and it looks amenable to parsing data into some sort of J nested array.
So, if we have some nested data like this (created in Octave):
q1.test1.calconst= rand(1,5);
q1.test1.raw1= rand(5,5);
q1.test1.raw2= rand(5,5);
q2.test1.calconst= rand(1,5);
q2.test1.raw1= rand(5,5);
q2.test1.raw2= rand(5,5);
It's not too hard to imagine a parallel name structure in J like this:
datnms=. 'q1';('test1';(<'calconst';'raw1';'raw2'));('test2';(<'calconst';'raw1';'raw2'));<'test3';<'calconst';'raw1';'raw2'
] datnms=. datnms,:'q2';('test1';(<'calconst';'raw1';'raw2'));('test2';(<'calconst';'raw1';'raw2'));<'test3';<'calconst';'raw1';'raw2'
+--+----------------------------+----------------------------+----------------------------+
|q1|+-----+--------------------+|+-----+--------------------+|+-----+--------------------+|
|  ||test1|+--------+----+----+|||test2|+--------+----+----+|||test3|+--------+----+----+||
|  ||     ||calconst|raw1|raw2||||     ||calconst|raw1|raw2||||     ||calconst|raw1|raw2|||
|  ||     |+--------+----+----+|||     |+--------+----+----+|||     |+--------+----+----+||
|  |+-----+--------------------+|+-----+--------------------+|+-----+--------------------+|
+--+----------------------------+----------------------------+----------------------------+
|q2|+-----+--------------------+|+-----+--------------------+|+-----+--------------------+|
|  ||test1|+--------+----+----+|||test2|+--------+----+----+|||test3|+--------+----+----+||
|  ||     ||calconst|raw1|raw2||||     ||calconst|raw1|raw2||||     ||calconst|raw1|raw2|||
|  ||     |+--------+----+----+|||     |+--------+----+----+|||     |+--------+----+----+||
|  |+-----+--------------------+|+-----+--------------------+|+-----+--------------------+|
+--+----------------------------+----------------------------+----------------------------+
With data nested the same way, we could decompose a name like
'q1.test1.calconst' to pull out the corresponding piece of data. Also,
looking at the non-binary save format, we see the Octave data like this:
# Created by Octave 3.6.2, Sun Feb 03 14:14:51 2013 Eastern Standard Time
...
# name: q1
# type: scalar struct
# ndims: 2
1 1
# length: 1
# name: test1
# type: scalar struct
# ndims: 2
1 1
# length: 3
# name: calconst
# type: matrix
# rows: 1
# columns: 5
0.225705 0.940538 0.915680 0.994779 0.084628
# name: raw1
# type: matrix
# rows: 5
# columns: 5
0.799543 0.590760 0.443348 0.241761 0.552149
0.888026 0.616080 0.939466 0.0362028 0.532692
0.416769 0.954700 0.378494 0.598029 0.717124
0.801416 0.723429 0.408911 0.505061 0.553901
0.0218875 0.736022 0.0789700 0.419904 0.002437
# name: raw2
# type: matrix
# rows: 5
# columns: 5
0.542203 0.446674 0.355061 0.870099 0.575727
0.00216731 0.723801 0.276619 0.265488 0.279922
0.695537 0.800476 0.882940 0.927363 0.828965
0.0116777 0.670176 0.731459 0.858518 0.492207
0.348193 0.533488 0.182100 0.551607 0.468953
(I've truncated the precision for this display) which looks like it could
be parsed into a J nested array corresponding to the name structure
outlined above. Interestingly, the nesting in this "save" format doesn't
seem to be made explicit. It appears to be inferred from the "# length"
count of the higher-level nodes and the spacing between items: the
definition of "q2" follows after six blank lines.
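As a rough check that the format really can be parsed this way, here is a small sketch, again in Python rather than J just to make the logic concrete. It assumes only what the listing above shows: every item starts with "# name:" and "# type:", a "scalar struct" carries "# ndims:", a dimensions line and "# length:", and a "matrix" carries "# rows:" and "# columns:". It relies on the "# length:" counts for nesting and ignores blank lines; other Octave types are not handled. The function name and the tiny sample values are made up for illustration.

```python
def parse_octave_text(text):
    """Parse the subset of Octave's text save format shown above
    (scalar structs and real matrices) into nested dicts of lists."""
    # Blank lines are dropped: nesting comes from '# length:' counts.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    pos = 0

    def header(key):
        nonlocal pos
        prefix = "# " + key + ":"
        if not lines[pos].startswith(prefix):
            raise ValueError(f"expected {prefix!r}, got {lines[pos]!r}")
        value = lines[pos][len(prefix):].strip()
        pos += 1
        return value

    def item():
        nonlocal pos
        name = header("name")
        kind = header("type")
        if kind == "matrix":
            rows = int(header("rows"))
            header("columns")
            value = [[float(x) for x in lines[pos + r].split()]
                     for r in range(rows)]
            pos += rows
        elif kind == "scalar struct":
            header("ndims")
            pos += 1                      # skip the dimensions line ("1 1")
            count = int(header("length"))
            value = dict(item() for _ in range(count))
        else:
            raise ValueError(f"unhandled type {kind!r}")
        return name, value

    result = {}
    while pos < len(lines):
        if lines[pos].startswith("# name:"):
            name, value = item()
            result[name] = value
        else:
            pos += 1                      # e.g. the '# Created by' line
    return result

# Made-up miniature file in the same layout as the listing above.
sample = """# Created by Octave
# name: q1
# type: scalar struct
# ndims: 2
 1 1
# length: 1
# name: test1
# type: scalar struct
# ndims: 2
 1 1
# length: 2
# name: calconst
# type: matrix
# rows: 1
# columns: 3
 0.5 0.25 1

# name: raw1
# type: matrix
# rows: 2
# columns: 2
 1 2
 3 4
"""

parsed = parse_octave_text(sample)
print(parsed["q1"]["test1"]["raw1"])
```

With the data in nested form like this, a name such as 'q1.test1.raw1' only needs to be split on the dots and followed one level at a time.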
On Sat, Feb 2, 2013 at 11:10 PM, David Porter wrote:
> The problem I faced in MatLab was to provide temperature data from an
> infra-red camera. The data from the camera was in the Air Force's Standard
> Archive Format (SAF) that allows all the raw data, calibration constants,
> and much ancillary data (such as data units, classification, and data type)
> to be saved in several files. The major advantage of using this approach
> and the SAF files is that the entire experiment may be reproduced at a
> later time to troubleshoot any aspect of the data handling. The major
> disadvantage is that several different files are needed, each with a
> similar but not identical format, and smaller pieces of the data were
> needed by different functions.
>
> I started using MatLab's structure as a way of pulling all the data from
> all the files together under a single variable e.g. q1. Using this
> variable, I could easily provide the entirety of the data to a function.
>
> At times I was only interested in the irradiance to temperature
> calibration constants, while at other times I was interested only in the
> actual raw data. I could separate out the data I needed and provide that
> as an argument to another function by using the hierarchical naming, e.g.
> q1.test1.calconstants. The structure notation is simple to use, requiring
> only assignments to the properly named entity. Recovering the data was as
> easy as typing the same named entity. After using this notation for a
> while, I thought of it more as a database construction and query language.
>
> The need to make a copy of the data comes from assigning the output of a
> function to a name. I would like to assign the variable used in the
> function (say q1) to a more meaningful name (say replication01). As I
> stated above, I could also pass pertinent subsets of data to functions that
> worked on that data without burdening the function with all the other data.
>
> The above are some of the problems that the MatLab Structure solves. Can
> it be done other ways? Certainly. But after I worked with this notation in
> MatLab, I found it to be a valuable way to store a large amount of data
> efficiently.
>
> An approach much like your example is what I was thinking I would have to
> do before I emailed the forum. It may still be where I end up.
>
> Thanks,
> Dave
>
>
> On 2/2/2013 8:28 AM, Raul Miller wrote:
>
>> As has been pointed out, J's locales are not hierarchical.
>>
>> J's boxed structures are hierarchical but are not implicitly named.
>>
>> A question is: why do you need both a hierarchy and a way of making
>> copies of a part of that hierarchy?
>>
>> Why not use a single copy?
>>
>> What problem does the hierarchy solve?
>>
>> It's entirely possible to emulate the hierarchy in J, but there will
>> be performance costs for some operations. In your case you had three
>> operations:
>>
>> Defining an initial value
>> Copying a subtree
>> Accessing a final value
>>
>> I can think of one approach where access is fast, and another approach
>> where copy is fast, for example. But since I do not know why you want
>> this in the first place I do not know if either is appropriate.
>>
>> For example, you might use the letter 'z' to mark hierarchical breaks.
>> Here, you would implement copy using a routine which enumerates your
>> names and which makes fresh copies of them.
>>
>> For example, you might use indirect references to achieve your
>> hierarchical breaks. Here, you would implement access using a routine
>> which visits each locale to find the reference to the next locale.
>>
>> In either case, if you do not like the appearance of the mechanism
>> (and you probably should not find it appealing) you would build
>> routines to hide that appearance from arbitrary code.
>>
>>
--
Devon McCormick, CFA
^me^ at acm.
org is my
preferred e-mail
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm