Re: Modeling combinations (options and dependencies)

2023-05-19 Thread Peter J. Holzer
On 2023-05-18 19:21:23 +, eacil wrote:
> -
> DATA
> -
> 
> -You have objects that are linked to many tables describing their
> properties, such as an object__tag table.
> -These objects are sold in releases.
> -Thing is, objects can be made of objects. I call them superobjects
> when they assume that container role (as objects, they can be part of
> other superobjects). Multiple releases can be linked to one
> superobject. Yes, it's a parts scenario.
> -Objects can be shuffled in all sorts of unpredictable ways so it's
> useless to think that if one object is contained inside some
> superobject A, if that superobject A will be contained inside another
> superobject B, superobject B will inherit all objects from superobject
> A. In fact, my attempt at a solution will instantiate objects to link
> them to superobjects.
> -In a superobject, multiple objects can be grouped together into a set
> of options. One option can be made of multiple objects (majority of
> the time, just one). You can choose how many options minimum and
> maximum (aka a range) you can choose from a set (majority of the time,
> just one). It's possible to have a NONE choice aka you don't choose
> any option.
> -Any object can have 0 (common), 1 (common), or more (very rare)
> dependencies.
> 
> That's pretty much all.

Not sure if I followed that completely, but it certainly looks like a
quite complicated system of constraints.

> CREATE TABLE pairing (
> superobject_id integer NOT NULL,
> instance_a integer NOT NULL,
> instance_b integer NOT NULL,
> CONSTRAINT pairing__pk PRIMARY KEY (superobject_id,instance_a,instance_b)
> );

So if I understand this correctly, you are computing all the
combinations somewhere outside the database and then filling in the
pairings table with all valid pairs? Doesn't that have the same problem
of possibly generating an exponential number of combinations?

I'm also not sure if this is sufficient. If (A,B),(A,C),(A,D),(B,C),
(B,D),(C,D) are all valid combinations within superobject S, does that really
mean that (A,B,C,D) is a valid combination? I suspect you still have to
validate the results.

If you have to validate the results anyway, maybe you can radically
simplify the filter: Just add one row for each object which can possibly
appear in each super-object (or even: 1 row for each super-object with
an array of objects). Then find all super-objects which can contain all
the objects you are looking for and finally filter those in your
application. (Although I wonder how fast that validation is: That also
looks like it could potentially have exponential runtime)

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Modeling combinations (options and dependencies)

2023-05-19 Thread eacil
Hello. I am struggling for the longest time about data I can't fit into a 
satisfying model. I thought about giving up about this project many times...
Context: it's for a web server so speed is of the essence. It's an amateur 
project so expect cheap server. I am also more a novice than a seasoned user so 
please have mercy on me.

Here we go, I hope I will be able to correctly explain my problem because 
chatgpt failed spectacularly to suggest anything even remotely worthwhile (yes, 
pretty desperate move, I know, but I am a desperate man).

-
DATA
-

-You have objects that are linked to many tables describing their properties, 
such as an object__tag table.
-These objects are sold in releases.
-Thing is, objects can be made of objects. I call them superobjects when they 
assume that container role (as objects, they can be part of other 
superobjects). Multiple releases can be linked to one superobject. Yes, it's a 
parts scenario.
-Objects can be shuffled in all sorts of unpredictable ways so it's useless to 
think that if one object is contained inside some superobject A, if that 
superobject A will be contained inside another superobject B, superobject B 
will inherit all objects from superobject A. In fact, my attempt at a solution 
will instantiate objects to link them to superobjects.
-In a superobject, multiple objects can be grouped together into a set of 
options. One option can be made of multiple objects (majority of the time, just 
one). You can choose how many options minimum and maximum (aka a range) you can 
choose from a set (majority of the time, just one). It's possible to have a 
NONE choice aka you don't choose any option.
-Any object can have 0 (common), 1 (common), or more (very rare) dependencies.

That's pretty much all.
It looks like a hierarchy but it is not really one. It looks like a graph but 
probably not one either.
It's made of multiple concepts that, alone, are difficult enough already, but 
now they are all grouped together and it's going well over my head.
In the end, I think it's about combinations with a bit of hierarchy to make 
everything more painful...

Here is a simple "tree" representation of that data. Think of "( )" as radio 
buttons. Empty linebreaks are separating sets of options. X indicates mandatory 
nodes (unless their parent is not chosen). "(n-m)" represents the min-max range 
for a set of options.

(x) Object 0 (it's the superobject)

(1-1) Object A1
(1-1) Object A2
(1-1) Object A3

(x) Object B1

(1-1) Object C1
│ (x) Object D1
│
│ (0-1) Object E1
│ (0-1) Object E2 + Object E3
│
│ (1-2) Object F1
│ (1-2) Object F2 {requires A2}
│
(1-1) Object C2
│ (x) Object G1
│
│ (x) Object E1

As you can see, a user can choose between three options A, E (with the second 
option being made of two objects and the third option making this set 
optional), two options C, F. If they choose object C2, they gets objects G1 and 
E1 by default.
Choosing object F2 forces you to have both objects C1 and A2. It's ugly to 
represent constraints like that but impossible to draw in 2D properly.

The reason why I said that it wasn't really a hierarchy is that you have three 
groups at "depth 1": object 0, object B, and objects C with their children. 
These three groups are all equal to each others and their position could be 
swapped with each other. Same with the children of objects C. Groups D, E, F, 
could be swapped with each other.

When you buy a release, you have to go from the top to the bottom and go 
through every branch until you reach its end.
Here, you could buy objects 0, A2, B1, C1, D1, E2, E3, F2.
Or 0, A1, B1, C1, D1, F1.
Or 0, A3, B1, C2, G1, E1.

Superobjects don't have much depth to begin with, depth 6 max.
Sets of options per superobject won't go probably above 10, but options 
themselves can go pretty high like 50.
Most objects don't have dependencies to tell you. The average superobject is 
made of a bunch of mandatory objects and few sets of options with few options 
(half of them probably being if you want the object or not).
I know it's vague but the data is unpredictable with lot of exceptions and 
that's why I need flexibility and can't hardcode anything.

-
PURPOSE
-

I don't really care at the moment as of how to render that tree above as it 
doesn't appear to be a big challenge. What I care is building a performant 
search engine to find superobjects (and releases above them) given columns of 
objects (subqueries) selected through joining objects with their related tableS 
such as object__tag. You have to find superobjects you can buy, while 
respecting all those annoying constraints. So, if you search for a superobject 
containing E2 and G1, you cannot find superobject 0 because it's on different 
branches! Same, if you search for A1 and A2, you cannot find superobject 0 
because they are in the same set of