At 15:58 17/09/2013, Jason H wrote:
That's the whole point of using SQLite: it's not 'big iron', it's a bunch of iron filings working together. I'm just suggesting that getting such a thing going is not that much work, but I wanted to float it here to see if there were any good reasons why it was a bad idea. :-)

Hi! Folks,

I happened to read this question while considering a similar idea, and I am blocked because I do not know anything about SQLite's architectural behavior. I found a funny name for this kind of issue: SQLity, a portmanteau for "Swarm QuaLity": how, when and why a bunch of modular SQLite uses could (or could not) provide a better, simpler, cheaper solution than one big system.

I am not competent to discuss Jason's idea directly, so I will only explain my idea for you guys to see if there is some form of convergence between them (and how to implement mine). From the user's point of view I understand the question is the same: there is a big piece of cake; will one big eater swallow it, or will a swarm eat it faster?

My need is for a 3.0 wiki, or "dikty", solution based on MediaWiki and using SQLite ("diktyos" in Greek means networked system). A wiki is a database-centered service; a dikty is a wikipage-centered use. Some of the differences are:

- in wikipedia there is one page per concept, so this page must be neutral; in diktypedia there can be an unlimited number, so everyone can speak his mind.
- in wikipedia there is a single big main system with subhosts; in diktypedia there can be an unlimited number of machines and a DNS forming a P&P system.
- in wikipedia there is a single back-up; in diktypedia everyone makes his own back-up and can move his data around the way he wants.
- in wikipedia there is a big yearly budget; in diktypedia everyone can participate from his own site.
- in wikipedia there is a need for controllers; in diktypedia everyone can create his own commented semanticpedia, in any language, from any opinion.
- in wikipedia rights of access are per system; in diktypedia it is the same, but the systems are micro/individual systems, like blogs.
- in wikipedia there is a blog per mediawiki project

Due to the way MediaWiki is supported and my limited resources, I thought the best approach would be swarms of basic MediaWiki modules, each having its own small/medium-size SQLite file (local protection ensured) that one can move around and easily back up and replicate. For example, I can have my diktypages duplicated and maintained on my machine, or on a USB key.
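
To make the portability idea concrete, here is a minimal sketch (Python is used only for illustration, and the file names are invented) of snapshotting one dikty file to a USB key with SQLite's online backup:

import sqlite3

# One dikty = one small SQLite file that can be copied, moved or replicated.
src = sqlite3.connect("diktypages.sqlite")
dst = sqlite3.connect("/media/usbkey/diktypages-backup.sqlite")

# SQLite's online backup API copies a consistent snapshot even while the
# source file is still in use by the wiki module.
src.backup(dst)

dst.close()
src.close()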

My problem is the server's architecture and CPU load. Ideally I would need a MediaWiki/PHP machine where I could load or create a set of diktypages simply by loading/setting an SQLite file. But I have no idea how to organize this or how it would work.

If I add what I gather from Jason's question to mine, I would phrase the SQLity question as: can I use, in parallel, clusters of multiple specialized (and networked) SQLite tasks (which, for the users, appear as real [my case] or virtual [as I understand Jason] sub-tasks)? Is that sound or crazy? What are the limits? What are the pros and cons? (In the case of diktypedia, the biggest and most accessed systems could use other DBMSs.)

jfc


Hi, Jason,
It might be that this is a little bit too big for SQLite.
Maybe a "big iron" database like PostgreSQL or the Greenplum Database will fit your requirements better.
Best regards
Markus Schaber



> -----Original Message-----
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On behalf of Jason H
> Sent: Monday, 16 September 2013 23:04
> To: sqlite-users@sqlite.org
> Subject: [sqlite] SQLite clusters?
>
> I'm transitioning my job from embedded space to Hadoop space. I was
> wondering if it is possible to come up with a SQLite cluster
> adaptation.
>
> I will give you a crash course in Hadoop. Basically we get a very
> large CSV, which is chopped up into 64MB chunks and distributed to a
> number of nodes. The file is replicated two more times, for a total
> of three copies of all chunks on the cluster (no chunk is stored
> twice on the same node). Then the MapReduce logic is run and the
> results are combined. Instrumental to this is that the keys are
> returned in sorted order.
>
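To make that flow concrete, here is a toy sketch of the same idea (Python, with an invented word-count-style job; this is only the shape of the flow, not Hadoop's real API): split the input into chunks, map each chunk, then merge the per-chunk results with the keys kept in sorted order.

import csv, io
from collections import Counter

CHUNK_ROWS = 1000  # stand-in for Hadoop's 64MB chunks

def split(csv_text):
    """Chop the input into fixed-size chunks of rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    for i in range(0, len(rows), CHUNK_ROWS):
        yield rows[i:i + CHUNK_ROWS]

def map_chunk(chunk):
    """Toy map step: count the values in the first column of a chunk."""
    return Counter(row[0] for row in chunk if row)

def reduce_all(partials):
    """Combine the per-chunk counts; return keys in sorted order."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return sorted(total.items())

data = "a,1\nb,2\na,3\n"
print(reduce_all(map_chunk(c) for c in split(data)))   # [('a', 2), ('b', 1)]
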
> All of this is done in Java (70% slower than C, on average, and with
> some non-trivial start-up cost). Everyone is clamoring for SQL to be
> run on the nodes. Hive attempts to leverage SQL and is successful to
> some degree, but being able to use full SQL would be a huge
> improvement. Akin to Hadoop is HBase.
>
> HBase is similar to Hadoop, but it approaches things in a more
> conventional columnar format; it is a copy of Google's "BigTable".
> Here, the notion of "column families" is important, because column
> families are files. A row is made up of a key and at least one column
> family. There is an implied join between the key and each column
> family. As the table is viewed, though, it is viewed as a join between
> the key and all column families. What goes into a column family (cf)
> is not specified; the idea, however, is to group columns into cfs by
> usage. That is, cf1 is your most commonly needed data, and cfN is the
> least often needed.
>
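As a sketch of that data model in plain SQL terms (the table and column names are invented): each column family is its own table keyed by the row key, and the table the user sees is simply the join of the key with every family.

import sqlite3

db = sqlite3.connect(":memory:")

# Each column family is its own table, keyed by the row key.
db.executescript("""
    CREATE TABLE cf1 (row_key TEXT PRIMARY KEY, name TEXT, email TEXT);  -- hot data
    CREATE TABLE cf2 (row_key TEXT PRIMARY KEY, notes TEXT);             -- cold data
    INSERT INTO cf1 VALUES ('u1', 'Ann', 'ann@example.com');
    INSERT INTO cf2 VALUES ('u1', 'rarely read notes');
""")

# The "table" the user sees is the implied join of the key with all families.
row = db.execute("""
    SELECT cf1.row_key, cf1.name, cf1.email, cf2.notes
      FROM cf1 LEFT JOIN cf2 USING (row_key)
""").fetchone()
print(row)   # ('u1', 'Ann', 'ann@example.com', 'rarely read notes')
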
> HBase is queried through a specialized API. This API is written to
> work over very large datasets, working directly with the data.
> However, not all uses of HBase need this. The majority of queries are
> distributed just because they run over a huge dataset, with a modest
> number of rows returned. Distribution allows for much more parallel
> disk reading. For this case, a SQLite cluster makes perfect sense.
>
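A rough sketch of that case (Python; the shard file names and the schema are invented, and the shard files are assumed to already exist): the same query is run against many small SQLite shards in parallel and the modest per-shard result sets are merged at the end.

import sqlite3
from concurrent.futures import ThreadPoolExecutor

SHARDS = ["chunk-000.db", "chunk-001.db", "chunk-002.db"]   # invented names
QUERY = "SELECT key, value FROM data WHERE value > ?"

def run_on_shard(path):
    # Each worker reads its own shard file; one file per node/thread.
    con = sqlite3.connect(path)
    try:
        return con.execute(QUERY, (100,)).fetchall()
    finally:
        con.close()

with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
    partials = pool.map(run_on_shard, SHARDS)

# Merge the modest per-shard result sets, keeping the keys in sorted order.
results = sorted(row for part in partials for row in part)
print(results)
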
> Mapping all of this to SQLite, I could see a bit of work going a
> long way. Column families can be implemented as separate files, which
> are ATTACHed and joined as needed. The most complicated operation is
> a join, where we have to send the list of distinct values of the
> join key to all other nodes for join matching. We then have to move
> all of that data to the same node for the join.
>
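As a sketch of the ATTACH part of that idea (Python; the file, table and column names are invented, and both files are assumed to already hold a column-family table keyed by row_key): each column family lives in its own file, the files are attached to one connection and joined, and the distinct join-key values can then be pulled out for shipping to the other nodes.

import sqlite3

# The main connection holds cf1; the other column family is a separate file.
db = sqlite3.connect("cf1.db")
db.execute("ATTACH DATABASE 'cf2.db' AS cf2")

# Join across the two files as if they were one database.
rows = db.execute("""
    SELECT a.row_key, a.name, b.notes
      FROM main.cf1 AS a
      JOIN cf2.cf2  AS b USING (row_key)
""").fetchall()

# For the distributed case: the distinct join-key values that would have to
# be sent to the other nodes for join matching.
keys = [k for (k,) in db.execute("SELECT DISTINCT row_key FROM main.cf1")]
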
> The non-data input is a traditional SQL statement, but we will have
> to parse and restructure the statement to join the needed column
> families. Also needed is a way to ship a row to another server for
> processing.
>
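A sketch of the row-shipping part (Python; JSON over whatever transport the cluster already uses is just one invented option, and the cf1 table is assumed): pull the requested rows out of the local SQLite file and serialize them for the node that will perform the join.

import json
import sqlite3

def rows_to_ship(db_path, key_values):
    """Serialize the local rows whose join keys another node has asked for."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row
    try:
        placeholders = ",".join("?" * len(key_values))
        cur = con.execute(
            "SELECT * FROM cf1 WHERE row_key IN (%s)" % placeholders,
            key_values)
        # One JSON object per row; sending them (sockets, HTTP, ...) is left
        # to whatever transport the cluster already has.
        return [json.dumps(dict(row)) for row in cur]
    finally:
        con.close()
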
> I'm just putting this out there, thinking out loud. I wonder how it
> would turn out. Comments?

