On Tue, Apr 6, 2021, 01:28 PM Henry Rich <[email protected]> wrote:
> 'May be'.  Have you actually implemented it?

Yes, I was playing with this idea some time ago. With this approach, an example 
database "sed" from
Jd [1] may be implemented as following (best viewing with monospace font, 
beware line wraps):

NB. r
rno=. 0 4 5 6                  NB. field #0
rname=. <;._1 '  Naples Rome'  NB. field #1
popul=. 0 1000 2000            NB. field #2

NB. s
sno=. 0 31 33 34 35                                    NB. field #3
sname=. <;._1 '  Auto Truck'                           NB. field #4
name=. <;._1 '  Sales Clerical Engineering Marketing'  NB. field #5

NB. e
ename=. <;._1 '  Smith Jones Robinson Jasper Steinberg Rafferty'  NB. field #6

NB. t
tno=. 0 11 12 13 14                   NB. field #7
tname=. <;._1 '  Desk Cabinet Chair'  NB. field #8
tsize=. <;._1 '  small medium'        NB. field #9
tcat=. ' ABR'                         NB. field #a

shape=. (# rno) , (# rname) , (# popul) , (# sno) , (# sname) , (# name) , (# 
ename) , (# tno) , (# tname) , (# tsize) , (# tcat)
db=. 1 $. shape ; (i. # shape) ; 0

NB. tbl:                                   r               s                    
  e           t
NB. col:  0 1 2 3 4 5 6 7 8 9 a            0 1      2      3  4     5           
  6           7  8       9      a
db=. 1 (< 1 1 1 1 1 1 6 0 0 0 0)} db  NB. (4 Naples 1000) (31 Auto  Sales      
) (Rafferty ) (                   )
db=. 1 (< 1 1 1 4 1 4 0 0 0 0 0)} db  NB. (4 Naples 1000) (35 Auto  Marketing  
) (         ) (                   )
db=. 1 (< 2 2 2 2 2 3 2 1 1 1 1)} db  NB. (5 Rome   2000) (33 Truck 
Engineering) (Jones    ) (11 Desk    small  A)
db=. 1 (< 2 2 2 2 2 3 2 4 2 2 3)} db  NB. (5 Rome   2000) (33 Truck 
Engineering) (Jones    ) (14 Cabinet medium R)
db=. 1 (< 2 2 2 2 2 3 4 2 2 2 2)} db  NB. (5 Rome   2000) (33 Truck 
Engineering) (Jasper   ) (12 Cabinet medium B)
db=. 1 (< 2 2 2 2 2 3 5 0 0 0 0)} db  NB. (5 Rome   2000) (33 Truck 
Engineering) (Steinberg) (                   )
db=. 1 (< 3 1 2 1 2 2 0 0 0 0 0)} db  NB. (6 Naples 2000) (31 Truck Clerical   
) (         ) (                   )
db=. 1 (< 3 1 2 3 1 2 1 3 3 1 1)} db  NB. (6 Naples 2000) (34 Auto  Clerical   
) (Smith    ) (13 Chair   small  A)
db=. 1 (< 3 1 2 3 1 2 3 0 0 0 0)} db  NB. (6 Naples 2000) (34 Auto  Clerical   
) (Robinson ) (                   )

Note: leading axis element (IO=0) is reserved for "no relation" case.

> I find it very hard to believe that a J sparse array would be the best
> choice for any database.  It seems to me to:
> 
> * take lots of memory per value

It's true. The sketch above consumes 11 integers to store one end-to-end tuple.

By the way, for sparse array A with element e:
  e=. ISO { A
cases where sizeof(ISO)>sizeof(e) are general. Opposite cases are:
- integer or float sparse vectors (sizeof(ISO)=8, sizeof(e)=8)
- complex sparse vectors and matrices (sizeof(ISO)<=2*8, sizeof(e)=2*8)
- other cases with non-scalar numeric non-boolean values (sizeof(e)>8)

Memory saving appears due to absence of necessity to store and maintain:
- "foreign key" relation columns in each relating table,
- intermediate many-to-many relation tables.

Data fields in Jd sed database occupies 880 bytes, 388 of which (44%) are used 
to store foreign key relation fields.

Due to small data set, indices really consumes 24 bits which fits in single 
integer. But this storage scheme will
require additional pre- and post-processing.

> * have limited or inefficient queries

Queries may be performed by two approaches:
1) operating on index array (4 $. db): operating with columns as integer sets;
2) operating on subarrays (SUB_ISO { db): operating with sparse boolean 
matrices.

Both may benefit from rich J facilities.

Indeed, all kinds of queries are possible.

> * be very slow for updates, as the entire table must be copied

To establish or relax a relation, a single bit must be set or reset at certain 
index (though, it's not really true
in common cases). This can be executed in-place.

References:
[1] https://www.jsoftware.com/jd_tuts.html#sed

-- 
Regards,
Igor


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to