On 04/06/13 17:37, Simon Slavin wrote:
> On 4 Jun 2013, at 3:09pm, Eleytherios Stamatogiannakis <est...@gmail.com> wrote:
>
>> Is there any way to go beyond the SQLITE_MAX_ATTACHED limit for *read only*
>> attached DBs?
>
> See section 11 of
> <http://www.sqlite.org/limits.html>
>
> It's a 64-bit value, and two bits are already taken up.
Yes, I have seen it in SQLite's code. I considered changing it to a
bitfield, but the problem is that this long int is used in various other
places in SQLite's internals for transaction metadata.
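In practice this means that, even with SQLITE_MAX_ATTACHED raised to its
compile-time ceiling, the 63rd ATTACH fails. A rough sketch (assuming a
build configured with -DSQLITE_MAX_ATTACHED=62 and hypothetical shard
file names):

attach database 'shard01.db' as shard01;
-- ... 61 more ATTACH statements ...
attach database 'shard63.db' as shard63;
-- fails with an error along the lines of:
--   Error: too many attached databases - max 62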
> You can attach databases, copy data from them to the main database, then detach
> those and attach some others. Or you can create a hierarchy of shards (each of
> 62 shards can point to up to 62 others). Or you can rewrite your code so it
> never uses more than 62 shards no matter how many nodes are available.
Attaching and detaching is only useful for materializing the shards
into a single table:

create table T(...);                  -- same schema as the per-shard table
attach database 'T1.db' as T1;
insert into T select * from T1.T;
detach database T1;
attach database 'T2.db' as T2;
insert into T select * from T2.T;
...
How would a hierarchy of shards work? You cannot attach a DB onto
another attached DB, which is what the hierarchy idea would require.

Also, using only 62 shards at a time would be very constraining/slow in
our use case (see below).
> None of them good solutions, I'm afraid.
Yes :-(.
>> Also, is there any way for SQLite to create an automatic index on a view (or
>> Virtual Table), without having to first materialize the view (or VT)?
>
> I believe that SQLite needs the data to be in one place (i.e. at least a
> virtual table) for the indexing routine to work.
We have tried with both views and VTs, but SQLite does not create
automatic indexes on them at all. So right now, to get automatic
indexes from SQLite's side, we materialize all Virtual Tables into
plain tables:
create temp table T as select * from UnionAllVT1;
create temp table G as select * from UnionAllVT2;
...
which doubles our I/O to process a single sharded table.
- 1 full read + 1 full write of all data to materialize the UnionAllVT
into a plain table.
- 1 full read + 1 full write of the data in the materialized table to
create the automatic index.
It would be very nice if the automatic index could be created directly
from the UnionAllVT, but we haven't found a way to do it.
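To make the two passes concrete, here is roughly what the current
workflow looks like (the join column "key" is hypothetical; G is the
second materialized table from above):

-- pass 1: full read of every shard + full write into the temp table
create temp table T as select * from UnionAllVT1;
-- pass 2: for a join like this SQLite may build a transient (automatic)
-- index on T.key, which means re-reading T in full and writing the index
select count(*) from G join T on G.key = T.key;

-- written directly against the VT the join still runs, but SQLite does
-- not build automatic indexes over virtual tables, so the intermediate
-- temp table (and the extra read+write pass) cannot be avoided:
select count(*) from G join UnionAllVT1 as T on G.key = T.key;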
> If you're willing to put a bit of SQLite-only effort in, you could implement
> your own virtual table implementation that consulted data on each of your
> nodes. This would be quite highly customised for your own application's
> requirements but it would mean you didn't have to do any attaching or detaching
> at all. Your SQLite API calls could address your data as if it was all in one
> database file but SQLite would understand how data is partitioned between nodes
> and automatically gather it from all the necessary nodes.
We have already done this (creating VTs is very easy in madIS [*]). We
have the UnionALL virtual table that scans over all the DB shards.

The above UnionALL VT only supports scans, and it is only used to
materialize the shards into a regular table. Having per-shard indexes
would be very costly, because each Filter on the UnionALL VT would need
to be passed to all of the shards. A single automatic index works best.
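As a rough worked example (shard and row counts purely hypothetical):
with 1,000 shards and a 10,000,000-row probe table, pushing the equality
Filter into the shards means 10,000,000 x 1,000 = 10,000,000,000
per-shard lookups, whereas a single automatic index on the materialized
table is built once and then serves the same 10,000,000 probes.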
> Another way to do it would be to implement your own VFS which would distribute
> over the nodes not at the row level but as if they were all one huge storage
> medium (i.e. like a RAID).
Each shard that we use is already a self-contained SQLite DB. We would
need to change our whole approach to convert to disk-page-based
sharding.
Thanks for your ideas.
l.
[*]
https://code.google.com/p/madis/source/browse/src/functions/vtable/unionalldb.py