Re: [basex-talk] BaseX Capacity

Dirk Kirsten Thu, 28 Mar 2013 02:27:11 -0700

Hello Raj,

thanks for your interest in BaseX.

You can see the current upper limits of BaseX at [1]. As listed there, the
current upper file size limit is 512 GiB per database. However, you can
always distribute your data across several databases: databases in BaseX
are a fairly lightweight concept, and you can access multiple databases
within one XQuery expression. So, theoretically, you can store terabytes
of data.
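
To get a feel for the numbers, here is a rough back-of-the-envelope sketch
in Python. It only divides a total data volume by the 512 GiB limit quoted
above. Two things are assumptions: whether "24 TB" means decimal terabytes
or binary tebibytes, and that the raw XML size is a usable stand-in for
the size of the resulting database (it may well differ), so treat the
result as an estimate only. The function name is made up for illustration.

```python
GIB = 1024 ** 3   # gibibyte, the unit of the 512 GiB limit
TB = 1000 ** 4    # decimal terabyte (one reading of "24 TB")
TIB = 1024 ** 4   # binary tebibyte (the other reading)

MAX_DB_BYTES = 512 * GIB  # per-database limit quoted in the text


def databases_needed(total_bytes, per_db_bytes=MAX_DB_BYTES):
    """Return the smallest number of databases that can hold total_bytes.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    return (total_bytes + per_db_bytes - 1) // per_db_bytes


if __name__ == "__main__":
    print(databases_needed(24 * TB))   # decimal reading of 24 TB
    print(databases_needed(24 * TIB))  # binary reading of 24 TB
```

Under these assumptions the data would have to be split across a few dozen
databases, which says nothing yet about how fast queries over them run.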

However, whether query execution against such a large data set will be
efficient is very difficult to tell. It heavily depends on the type of
query you want to run, but personally I would not expect blazing
performance. But again, this is very hard to tell.

Scaling out and replication are currently not supported by BaseX. You can
always use some kind of distributed file system to physically distribute
your data, but BaseX itself does not do this for you. You could also
start several BaseX servers and store certain data on specific servers,
but there will be no synchronization of any kind. However, we would love
to change this, and this is actually my current project.
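
As an illustration of "certain data at specific servers", the following
Python sketch shows one way a client application could decide which server
a document belongs to. This is not a BaseX feature: the routing lives
entirely in the client, the server addresses are hypothetical, and the
hash-based scheme is just one possible choice.

```python
import hashlib

# Hypothetical server addresses, for illustration only.
SERVERS = ["basex-a:1984", "basex-b:1984", "basex-c:1984"]


def server_for(doc_key, servers=SERVERS):
    """Map a document key to one server, deterministically.

    The key is hashed with SHA-256 and the first 8 bytes of the digest,
    read as an integer, are reduced modulo the number of servers.
    """
    digest = hashlib.sha256(doc_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for key in ["orders/2013-03.xml", "orders/2013-04.xml"]:
        print(key, "->", server_for(key))
```

The same key always maps to the same server, so reads and writes for a
document end up in the same place. Note that changing the number of
servers remaps most keys, and that nothing here keeps the servers in sync.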

I gave a short talk about our plans at our user meet-up at XML Prague. You
can see the slides at [2] (hopefully the videos will be there soon as
well). So we are interested in scaling out and replication, and therefore
I am also very interested in real-world use cases. It would help a lot if
you could tell me more about your specific requirements (either by private
mail or on the mailing list), so that we end up with a solution that is
usable in the real world.

Cheers,
Dirk

[1] http://docs.basex.org/wiki/Statistics
[2] http://files.basex.org/xmlprague2013/

On Tue, Mar 26, 2013 at 9:22 PM, Rajabrata Chaudhuri <[email protected]> wrote:

> Hello,
>
> First I'd like to thank you guys for all your great work on BaseX.  I am
> fairly familiar with XML DBs and have done a significant amount of
> development on top of Mark Logic.  I would like to ask some questions about
> capacity and scalability.  I have reviewed the documentation and see that
> the biggest store is for SDMX @ approximately 8000 GB.  So I am just trying
> to understand what this means better and would appreciate any of your
> expert advice for my questions below:
>
> 1.  Is the expectation that you can query against 8 TB of XML data
> efficiently?
> 2.  My requirements will be to query across probably 24 TB of XML data.
> Do you guys feel this is possible?
> 3.  What is the method to scale horizontally and vertically?  I.E. Would I
> be adding more servers, or starting more instances, etc.?
> 4.  How does high availability work?  I.E. Can I have multiple
> active-active nodes, or should it be active-passive, etc.?
>
> Any help anyone can render is greatly appreciated.
>
> Thanks
> Raj
>
>
>
> _______________________________________________
> BaseX-Talk mailing list
> [email protected]
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>
>

-- 
Dirk Kirsten, BaseX GmbH, http://basex.org
|-- Firmensitz: Blarerstrasse 56, 78462 Konstanz
|-- Registergericht Freiburg, HRB: 708285, Geschäftsführer:
|   Dr. Christian Grün, Alexander Holupirek, Michael Seiferle
`-- Phone: 0049 7531 28 28 676, Fax: 0049 7531 20 05 22

_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
