Brett,
Let me clear up an important misconception!
Attributing the hard work to me is incorrect. I was on the Sedna team
once but had to quit for some obscure reasons. I still keep in touch
with the team members since they are very nice people. I also follow
the project since I am still unable to disconnect myself from Sedna
(though the second anniversary of my leaving the team is approaching).
I do know the internals quite well.
I wish I had tagged my previous message with an "unofficial" label or
something like that.
wbr
PS. I am going to write a full reply later. I just wanted to clear up
the misconception urgently.
2010/7/20 Brett Zamir <bret...@yahoo.com>
On 7/20/2010 4:47 PM, N. Zavaritsky wrote:
Hi Brett,
First of all let me express appreciation of your great work!
Thank you for your kind encouragement, mejedi!
It sounds sort of fantastic (wow, Firefox! everyone uses Firefox these
days!). Unfortunately Sedna was never designed to be an embeddable
database.
Yeah. One other manifestation of this is how the data directory is
not configurable. Since I wanted to include Sedna along with the
extension (its small size making it ideal for inclusion with an
extension, avoiding a need for remote downloads after initial
installation), if I kept it inside the extension, the large
databases you speak of would have been included inside the
extensions folder, meaning the user would lose their data each
time I updated the extension! My solution to this was to copy all
of Sedna into the Firefox profile folder and then use that copy.
This is less than ideal, but it did work.
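The copy-to-profile workaround described above can be sketched roughly as follows. This is only an illustration in Python, not the extension's actual XPCOM/JavaScript code; the function names, the `sedna` folder name, and the file names are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_profile_copy(bundled_dir, profile_dir, name="sedna"):
    """Copy the bundled database tree into the profile once, then reuse it.

    All names here are hypothetical; the real extension does this in
    its own XPCOM/JavaScript code.
    """
    target = Path(profile_dir) / name
    if not target.exists():
        # First run: seed the profile with a private copy.
        shutil.copytree(bundled_dir, target)
    # Later runs (and extension updates) leave the profile copy alone.
    return target


def simulate_update():
    """Show that data written to the profile copy survives an update."""
    with tempfile.TemporaryDirectory() as tmp:
        extension = Path(tmp) / "extension" / "sedna"
        profile = Path(tmp) / "profile"
        extension.mkdir(parents=True)
        profile.mkdir()
        (extension / "bin.txt").write_text("v1")

        working = ensure_profile_copy(extension, profile)
        (working / "user.db").write_text("user data")

        # An update replaces the extension folder wholesale.
        shutil.rmtree(extension)
        extension.mkdir()
        (extension / "bin.txt").write_text("v2")

        working = ensure_profile_copy(extension, profile)
        return (working / "user.db").read_text(), (working / "bin.txt").read_text()
```

In this sketch the user's data survives the simulated update, but the copied program files stay at the old version, which is one way in which the approach is "less than ideal": a real implementation would also need some way to refresh the binaries without touching the data.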
Sedna launches and manages its own processes (one master + one process
per db + one process per query session). Though I'd prefer a database
that Firefox uses internally to be nicely wrapped inside Firefox
itself (no external processes, just a dll/so/dylib loaded into the
Firefox process, database code sharing threads with the browser), a
bunch of external processes is just a minor inconvenience. However,
there is more to come.
Yes, that sounds like it would be nice. If you ever get it built
in this way, I can see about getting our extension's XPCOM code
revised to deal with a DLL/so/dylib file instead. That would be
nice if it were possible. There might also be a bigger (if still
small?) chance of seeing Firefox itself add this as a component
for use by extensions, if not websites too.
An embeddable database must be lean. For a standalone database it is
OK to consume 90% of available memory; after all, the entire computer
may be dedicated to the task of running that particular database.
Obviously an embeddable database is not the main task running on the
computer, but just the opposite!
I see... Sure...
Sedna memory usage is configurable, but if given less than 100 MB of
RAM Sedna slows down considerably.
On the plus side, I have at least made the code asynchronous, so
while it may be eating up RAM and get slowed down if not allowed
enough RAM, at least the browser shouldn't come to a complete
stand-still while it's waiting. :)
Of course the amount of RAM necessary to achieve adequate performance
depends on the complexity of the queries used. It is quite reasonable
to expect that queries executed against an XML database embedded
inside Firefox won't be complex ones, so we could probably use less
memory.
I don't know how quickly complexity translates into RAM, but it
sure would be nice to be able to effectively allow complex
queries, as one advantage of being able to run XQuery statements
in the browser is that a server doesn't have to take the risk of
getting bogged down (as far as DoS or bandwidth costs) by such
requests (and the user doesn't have to wait for the usually
biggest bottleneck of waiting for network delivery of the results).
Finally there are issues with the data file. Well, just one ISSUE: the
data file size. It starts at 100 MB and grows in 100 MB increments.
These numbers are configurable as well, but don't expect to tune them
down to 1 MB or so :) The data representation used internally was not
designed to be space-efficient. Basically Sedna manages storage in
64 KB blocks. By design, distinctly named XML elements must live in
distinct blocks (it makes queries like doc("aaa")/foo/bar really
fast). So even a not-so-complex XML document can easily consume
hundreds of blocks. There are certain techniques that help to reduce
the number of blocks used ([a] redesigning data to utilize fewer
distinctly named elements or [b] storing documents with similar
structure together), but I doubt that end users will be willing to
learn and apply these techniques.
And one of the big pluses of an XML database as I see it and
understand it is its ability to store hierarchical,
non-predictably structured documents...
And lastly, the concept of "distinctly named XML elements living in
distinct blocks" was an oversimplification. It is not just the element
name that counts but the complete path from the document root. For
instance, book/section/title and book/section/figure/title occupy
distinct blocks. As you can see, Sedna is not great at storing XML
markup (count the number of, say, <a>-s in an average HTML page; how
many <a>-s share the same path from the document root?).
I see...
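To make the "one block per distinct path" point concrete, here is a rough Python sketch that counts the distinct root-to-element paths in a document and multiplies by the 64 KB block size mentioned above. It is only a back-of-the-envelope lower bound under the simplified model described in this thread, not Sedna's actual storage accounting: it ignores text nodes, attributes, indexes and any internal metadata, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, as described in the message above


def distinct_element_paths(xml_text):
    """Return the set of distinct root-to-element name paths."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        elem, path = stack.pop()
        paths.add(path)
        for child in elem:
            stack.append((child, path + "/" + child.tag))
    return paths


def min_block_bytes(xml_text):
    """Lower bound on storage if every distinct path needs its own block."""
    return len(distinct_element_paths(xml_text)) * BLOCK_SIZE


sample = """<book>
  <title>T</title>
  <section><title>A</title><figure><title>F</title></figure></section>
  <section><title>B</title></section>
</book>"""

paths = distinct_element_paths(sample)
for p in sorted(paths):
    print(p)
print(len(paths), "distinct paths,", min_block_bytes(sample), "bytes at minimum")
```

For the sample, book/section/title and book/section/figure/title are counted separately, matching the example given above, while the two book/section/title elements share one path.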
That's the end of the bad news.
Now for some good news.
Sedna is under continuous development. The idea of refining data
storage has been around for a long time. I guess we can address the
above-mentioned issues in time.
That would indeed be quite wonderful...
And I must say the opportunity to get Sedna into Firefox is pretty
exciting! Achieving really WIDE adoption of XML databases is so
important for the XML database community.
Yes, that is my hope too. Thank you for doing all of the hard work!
Also, thank you very much for the very informative coverage of the
issues. Although I have been hopeful of producing something
useful, maybe an even larger intention was to demonstrate the
potential or "proof-of-concept" of using native XML databases in
the browser and hopefully get people to consider what a logical
fit an XML database is in the browser (where as I think I
mentioned, anything from XHTML tables to unpredictably
hierarchical XML can be stored).
And feel free to try it out, as the main sample I refer to is not
just an example, but actually lets you store any number of
documents and perform XQueries on those documents (with
syntax-coloring thanks to EditArea and the XQuery plugin provided
by DQ). I'd like to make another sample file which allows
iterating over the current databases and viewing the original XML.
By the way, Sedna's size is the main reason I chose it (not
knowing enough technically to compare), since it is hard enough
to get people to install an extension (though restartless
extensions are also becoming possible). Can I ask how or why (in
layman's terms) your implementation is so much smaller, even by
several orders of magnitude, than the other XML databases out
there? Although it would be nice to see XQUF and Full-text, etc.
added too, it still seems quite impressive that you can cover
XQuery (and XUpdate) with such a small size!
best wishes
Brett
wbr,
mejedi
On Mon, 2010-07-19 at 22:36 +0800, Brett Zamir wrote:
Hello all!
I've just added an add-on for Firefox which uses Sedna; I'm
calling it "XDIB" or "XML Database In a Browser". See
https://addons.mozilla.org/en-US/firefox/addon/199900/ .
Basically, this lets any web developer who is given permission
by the user add XML content to Sedna (databases are stored in
Firefox's Profile folder by default, so that the data will not
be lost whenever the extension gets updated) and then query it,
update it, etc.
Actually, we've basically wrapped every command there is in
Sedna (though removing "Sedna" from the methods in case this
could become abstracted--sorry, no XQJ at the moment), though I
haven't really tested anything besides loading XML and doing
XQueries against data in a collection.
At the moment, the databases are accessible to any site which
requests permission, as I wanted this to be the default
behavior, since I think it should be up to users how they want
their data shared, and not have their own data be locked in by a
particular site, even if that site originated the storage of the
data. The access is granted depending on whether read
permissions, insert permissions, etc., are desired, and I really
need to add the ability to specify which database one has
permission to access, since for now it is any of them (though it
does ask permission at least).
I also hope to make the API available as special protocol links,
so a link could trigger the view of certain content, and if my
other add-on, Open URIs (at
https://addons.mozilla.org/en-US/firefox/addon/162154/ ),
is used together with this, one could link to locally stored
content by default, but fall back to an online site in case the
user is visiting the site without this add-on installed. I like
the idea of bookmarks that work offline, and which could do
fancy queries, and trigger a download of the data (and
subsequent auto-updates) if the data had not yet been
downloaded. But that's not implemented at the moment...
I paid a friend to implement the C++ code as part of an XPCOM
component (XPCOM being Mozilla's Cross-Platform Component Object
Model, which, among other things, lets JavaScript communicate
with C++ so that people like me who do not know C++ can still
write Firefox extensions or the like which interact with cool
tools or libraries like Sedna), so my apologies if anything has
been lost in translation as far as how the API is to be used
(though I know it is working for XQueries at least--though I do
need help on figuring out how Sedna can support UTF-8 if it uses
"const char", as I understand that to be ASCII only; the
component may need to be updated to support multi-byte strings).
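On the "const char" question: a C `char` is simply a byte, so a `const char *` string is not inherently ASCII-only. UTF-8 encodes each non-ASCII character as two or more bytes, none of which is zero, so UTF-8 text can travel through a NUL-terminated byte-string interface unchanged. The small Python sketch below illustrates only that general point, using the standard `ctypes` module to stand in for a C string. Whether the XPCOM component and Sedna actually interpret those bytes as UTF-8 is not established in this thread and would need to be checked separately.

```python
import ctypes

text = "naïve café"

# Encode to UTF-8: non-ASCII characters become multi-byte sequences.
encoded = text.encode("utf-8")

# UTF-8 never produces a zero byte for non-NUL characters, so the
# data is safe to hold in a NUL-terminated C string.
has_nul = 0 in encoded

# Pass the bytes through a C-style char pointer and read them back.
c_string = ctypes.c_char_p(encoded)
round_tripped = c_string.value.decode("utf-8")

print(len(text), "characters ->", len(encoded), "bytes")
print("round trip intact:", round_tripped == text)
```

The character count and byte count differ, which is the practical catch: any code that assumes one byte per character (for lengths, offsets or truncation) would need attention even if the bytes themselves pass through intact.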
And since my friend is a Windows guy, we've only compiled so far
in Windows, but since this is supposed to be cross-platform,
hopefully we can get this working on other systems as well.
As I mention on the add-on site, the API is of course very much
new (not to mention there may well be problems in our code), so
while we are whole-heartedly encouraging experimentation, please
do not use this for critical content, nor depend on the API
remaining frozen.
I have the faint hope that we could get XML databases to be part
of HTML5; e.g., see the debate at
http://hacks.mozilla.org/2010/06/beyond-html5-database-apis-and-the-road-to-indexeddb/
or
http://hacks.mozilla.org/2010/06/comparing-indexeddb-and-webdatabase/comment-page-1/#comment-95595
. My energy is fairly limited as far as what I can contribute,
but I do hope to make piecemeal progress on this, and fully
welcome anyone interested to offer their feedback, improvements,
or whatever... :)
(I've cc'd myself if you wish to get in touch off-list to
indicate you wish to be kept informed.)
best wishes,
Brett
_______________________________________________
Sedna-discussion mailing list
Sedna-discussion@lists.sourceforge.net
<mailto:Sedna-discussion@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/sedna-discussion