At 15:58 17/09/2013, Jason H wrote:
That's the whole point of using SQLite: it's not 'big iron', it's a bunch of iron filings working together. I'm just suggesting that getting such a thing going is not that much work, but I wanted to float it here to see if there were any good reasons why it was a bad idea. :-)

Hi! Folks,

I happened to read this question while considering a similar idea, and I am blocked because I do not know anything about SQLite's architectural behavior. I found a funny name for this kind of issue: SQLity, a portmanteau for "Swarm QuaLity": how, when and why a bunch of modular SQLite uses could (or could not) provide a better, simpler, cheaper solution than one big system.

I am not competent to discuss Jason's idea directly, so I will only explain my idea for you guys to see if there is some form of convergence between them (and how to implement mine). From the user's point of view I understand the question is the same: there is a big piece of cake; will one big eater swallow it, or will a swarm eat it faster?

My need is for a 3.0 wiki, or "dikty", solution based on MediaWiki and using SQLite ("diktyos" in Greek means networked system). A wiki is a database-centered service; a dikty is a wikipage-centered use. Some of the differences are:

- in wikipedia there is one page per concept, so this page must be neutral; in diktypedia there can be an unlimited number, so everyone can speak his mind.
- in wikipedia there is a single big main system with subhosts; in diktypedia there can be an unlimited number of machines and a DNS forming a P&P system.
- in wikipedia there is a single back-up; in diktypedia everyone makes his own back-up and can move his data around the way he wants.
- in wikipedia there is a big yearly budget; in diktypedia everyone can participate from his own site.
- in wikipedia there is a need for controllers; in diktypedia everyone can create his own commented semanticpedia, in any language, from any opinion.
- in wikipedia rights of access are per system; in diktypedia it is the same, but the systems are micro/individual systems, like blogs.
- in wikipedia there is a blog per mediawiki project

Due to the way MediaWiki is supported and my limited resources, I thought the best approach would be swarms of basic MediaWiki modules, each having its own small/medium-size SQLite file (local protection ensured) that one can move around and easily back up and replicate. For example, I can have my diktypages duplicated and maintained on my machine, or on a USB key.
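
To make the portability idea concrete, here is a minimal sketch (Python is used only for illustration, and the file names are invented) of snapshotting one dikty file to a USB key with SQLite's online backup:

import sqlite3

# One dikty = one small SQLite file that can be copied, moved or replicated.
src = sqlite3.connect("diktypages.sqlite")
dst = sqlite3.connect("/media/usbkey/diktypages-backup.sqlite")

# SQLite's online backup API copies a consistent snapshot even while the
# source file is still in use by the wiki module.
src.backup(dst)

dst.close()
src.close()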

My problem is the server's architecture and CPU load. Ideally I would need a MediaWiki/PHP machine where I could load or create a set of diktypages simply by loading/setting an SQLite file. But I have no idea how to organize this or how it would work.

If I add what I gather from Jason's question to mine, I would phrase the SQLity question as: can I use, in parallel, clusters of multiple specialized (and networked) SQLite tasks (which, for the users, appear as real [my case] or virtual [as I understand Jason] sub-tasks)? Is that sound or crazy? What are the limits? What are the pros and cons? (In the case of diktypedia, the biggest and most accessed systems could use other DBMSs.)

jfc


Hi, Jason,
It might be that this is a little bit too big for SQLite.
Maybe a "big iron" database like PostgreSQL or the Greenplum Database will fit your requirements better.
Best regards
Markus Schaber



> -----Original Message-----
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On behalf of Jason H
> Sent: Monday, 16 September 2013 23:04
> To: sqlite-users@sqlite.org
> Subject: [sqlite] SQLite clusters?
>
> I'm transitioning my job from embedded space to Hadoop space. I was
> wondering if it is possible to come up with a SQLite cluster
> adaptation.
>
> I will give you a crash course in Hadoop. Basically we get a very
> large CSV, which is chopped up into 64MB chunks and distributed to a
> number of nodes. The file is replicated two more times, for a total
> of three copies of all chunks on the cluster (no chunk is stored
> twice on the same node). Then the MapReduce logic is run and the
> results are combined. Instrumental to this is that the keys are
> returned in sorted order.
>
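To make that flow concrete, here is a toy sketch of the same idea (Python, with an invented word-count-style job; this is only the shape of the flow, not Hadoop's real API): split the input into chunks, map each chunk, then merge the per-chunk results with the keys kept in sorted order.

import csv, io
from collections import Counter

CHUNK_ROWS = 1000  # stand-in for Hadoop's 64MB chunks

def split(csv_text):
    """Chop the input into fixed-size chunks of rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    for i in range(0, len(rows), CHUNK_ROWS):
        yield rows[i:i + CHUNK_ROWS]

def map_chunk(chunk):
    """Toy map step: count the values in the first column of a chunk."""
    return Counter(row[0] for row in chunk if row)

def reduce_all(partials):
    """Combine the per-chunk counts; return keys in sorted order."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return sorted(total.items())

data = "a,1\nb,2\na,3\n"
print(reduce_all(map_chunk(c) for c in split(data)))   # [('a', 2), ('b', 1)]
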
> All of this is done in Java (70% slower than C, on average, and with
> some non-trivial start-up cost). Everyone is clamoring for SQL to be
> run on the nodes. Hive attempts to leverage SQL and is successful to
> some degree, but being able to use full SQL would be a huge
> improvement. Akin to Hadoop is HBase.
>
> HBase is similar to Hadoop, but it approaches things in a more
> conventional columnar format; it is a copy of Google's "BigTable".
> Here, the notion of "column families" is important, because column
> families are files. A row is made up of a key and at least one column
> family. There is an implied join between the key and each column
> family. As the table is viewed, though, it is viewed as a join between
> the key and all column families. What goes into a column family (cf)
> is not specified; the idea, however, is to group columns into cfs by
> usage. That is, cf1 is your most commonly needed data, and cfN is the
> least often needed.
>
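As a sketch of that data model in plain SQL terms (the table and column names are invented): each column family is its own table keyed by the row key, and the table the user sees is simply the join of the key with every family.

import sqlite3

db = sqlite3.connect(":memory:")

# Each column family is its own table, keyed by the row key.
db.executescript("""
    CREATE TABLE cf1 (row_key TEXT PRIMARY KEY, name TEXT, email TEXT);  -- hot data
    CREATE TABLE cf2 (row_key TEXT PRIMARY KEY, notes TEXT);             -- cold data
    INSERT INTO cf1 VALUES ('u1', 'Ann', 'ann@example.com');
    INSERT INTO cf2 VALUES ('u1', 'rarely read notes');
""")

# The "table" the user sees is the implied join of the key with all families.
row = db.execute("""
    SELECT cf1.row_key, cf1.name, cf1.email, cf2.notes
      FROM cf1 LEFT JOIN cf2 USING (row_key)
""").fetchone()
print(row)   # ('u1', 'Ann', 'ann@example.com', 'rarely read notes')
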
> HBase is queried through a specialized API. This API is written to
> work over very large datasets, working directly with the data.
> However, not all uses of HBase need this. The majority of queries are
> distributed just because they run over a huge dataset, with a modest
> number of rows returned. Distribution allows for much more parallel
> disk reading. For this case, a SQLite cluster makes perfect sense.
>
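A rough sketch of that case (Python; the shard file names and the schema are invented, and the shard files are assumed to already exist): the same query is run against many small SQLite shards in parallel and the modest per-shard result sets are merged at the end.

import sqlite3
from concurrent.futures import ThreadPoolExecutor

SHARDS = ["chunk-000.db", "chunk-001.db", "chunk-002.db"]   # invented names
QUERY = "SELECT key, value FROM data WHERE value > ?"

def run_on_shard(path):
    # Each worker reads its own shard file; one file per node/thread.
    con = sqlite3.connect(path)
    try:
        return con.execute(QUERY, (100,)).fetchall()
    finally:
        con.close()

with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
    partials = pool.map(run_on_shard, SHARDS)

# Merge the modest per-shard result sets, keeping the keys in sorted order.
results = sorted(row for part in partials for row in part)
print(results)
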
> Mapping all of this to SQLite, I could see a bit of work going a
> long way. Column families can be implemented as separate files, which
> are ATTACHed and joined as needed. The most complicated operation is
> a join, where we have to send the list of distinct values of the
> join key to all other nodes for join matching. We then have to move
> all of that data to the same node for the join.
>
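As a sketch of the ATTACH part of that idea (Python; the file, table and column names are invented, and both files are assumed to already hold a column-family table keyed by row_key): each column family lives in its own file, the files are attached to one connection and joined, and the distinct join-key values can then be pulled out for shipping to the other nodes.

import sqlite3

# The main connection holds cf1; the other column family is a separate file.
db = sqlite3.connect("cf1.db")
db.execute("ATTACH DATABASE 'cf2.db' AS cf2")

# Join across the two files as if they were one database.
rows = db.execute("""
    SELECT a.row_key, a.name, b.notes
      FROM main.cf1 AS a
      JOIN cf2.cf2  AS b USING (row_key)
""").fetchall()

# For the distributed case: the distinct join-key values that would have to
# be sent to the other nodes for join matching.
keys = [k for (k,) in db.execute("SELECT DISTINCT row_key FROM main.cf1")]
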
> The non-data input is a traditional SQL statement, but we will have
> to parse and restructure the statement to join the needed column
> families. Also needed is a way to ship a row to another server for
> processing.
>
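A sketch of the row-shipping part (Python; JSON over whatever transport the cluster already uses is just one invented option, and the cf1 table is assumed): pull the requested rows out of the local SQLite file and serialize them for the node that will perform the join.

import json
import sqlite3

def rows_to_ship(db_path, key_values):
    """Serialize the local rows whose join keys another node has asked for."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row
    try:
        placeholders = ",".join("?" * len(key_values))
        cur = con.execute(
            "SELECT * FROM cf1 WHERE row_key IN (%s)" % placeholders,
            key_values)
        # One JSON object per row; sending them (sockets, HTTP, ...) is left
        # to whatever transport the cluster already has.
        return [json.dumps(dict(row)) for row in cur]
    finally:
        con.close()
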
> I'm just putting this out there, thinking out loud. I wonder how it
> would turn out. Comments?

