On 21.9.2016 0:08, Jiri Jetmar wrote:
Independently of that that, things starts to be complicated in the SQL
world with large data when you are submitting e.g. a inner JOIN statement
in a transactional INSERT expression, where the tables are located (the
data) on different nodes, simply because of a single node limit.
Well... there are ways to circumvent this, too. :)
Unfortunately, can't tell much more about that, as it is key concept of
the company where I work right now.
But very large and globally distributed RDBMS nodes, which don't lag
under high stress, are very much so possible!
I would like to work on this task. Therefore smart ideas are highly welcome
! :-)
I am responsible for current SQL indexing code, and Paul is responsible
for current SQL entitystore code.
The indexing code is, unfortunately, quite spaghettified, and I would do
lots of things probably very differently now, than how I did them
several years ago.
I think one strategy to approach would be to re-write whole thing from
scratch, but keeping the main principles:
1. Each property and association should be stored in separate table.
This is required for the very dynamic nature of Zest queries.
This implicates that the whole DB schema will be owned by this
indexing service - it is not compatible with some external tool
generating tables (unless tables are, of course, in this exact required
format).
Example: if you have interface MyEntity { Property<String>
myProp(); Association<SomeOtherEntity> myAsso(); },
That would end up with 3 tables:
1.1. The table for implicit 'identity' property. Current indexing
code has separate DB-specific entity primary key which is of type 'long'
(64bit integer), since that makes foreign key references much faster and
compacter. Not sure if that is good idea anymore.
1.2. The table for 'myProp' property with 2 columns: foreign key
reference to identity table and actual column of type 'text' containing
the contents of 'myProp' property.
1.3. The table for 'myAsso' association with 2 columns: foreign key
reference to identity table of 'MyEntity' and foreign key reference to
identity table of 'SomeOtherEntity'. Obviously this covers the
'ManyAssociation's as well, if they are still present in Zest.
2. The SQL query will then be created by translating Zest query objects
into SQL query objects. You could also just build SQL strings, but it
would be quite nasty and most likely very unmaintaineable task - current
code uses java-sql-generator, which is my library on github, to create
SQL statements using DSL, and not by concatenating strings.
IIRC Zest AND-conditions are SQL INTERSECTIONs of single
property/asso queries, Zest OR-conditions are SQL UNIONs of single
property/asso queries, and Zest NOT-conditions are SQL NOTs of single
property/asso queries.
3. Keeping DB schema in sync with Zest application schema.
This is very tough one - if someone adds/removes
properties/associations, the database schema should be kept up-to-date.
I remember having problems with this one - it also implies that you
have to walk through the *whole* application structure in order to
generate tables for all the properties/assos of all the entities visible
to this indexing service.