Hi,
I'm one of the contributors to the JDBM3 serialization implementation.
Earlier this year I made it much faster than before (2 orders of
magnitude). BTW, I'm also a contributor to Kryo and
protostuff-runtime.
I find this discussion very interesting, so let me provide my two
cents as well.
First of all, I just want to mention that while working on improving
JDBM's serialization, I extracted the serialization part of the JDBM
into a dedicated serialization library, which I called Quickser. You
can find it on GitHub: https://github.com/romix/quickser
It is really very fast, often faster than Kryo and protostuff. Since
Quickser contains only the serialization-related code from JDBM/MapDB,
it is easier to use if you just want to add yet another serialization
method to DM without any DB-related functionality.
It could even make sense for MapDB to use Quickser for serialization
instead of keeping both DB and serialization functionality in one
pot.
@Jan: What do you think about it? I understand that you don't like
external dependencies. But Quickser is not really external. It is more
or less a copy of JDBM's serialization-related classes.
On Wed, Nov 7, 2012 at 9:49 AM, Jan Kotek <[email protected]> wrote:
Hi,
1. DirectMemory could make good use of mapdb to serialize least
frequently used items to disk and free memory
2. DirectMemory could implement a MapDB disk based store in addition
to the bytebuffer and unsafe ones
The only problem may be that MapDB currently does not support
concurrent transactions (it has only one single global transaction).
Not sure if it could be a problem.
However it implements ConcurrentMap, so it is possible to swap items
atomically.
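To illustrate the atomic swap mentioned above, here is a minimal sketch that uses only the JDK's ConcurrentMap contract. It uses ConcurrentHashMap as a stand-in, on the assumption that a MapDB map behaves the same way through that interface; the class name AtomicSwapSketch and the helper swap are made up for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class AtomicSwapSketch {

    // Replaces the value for `key` only if it still equals `expected`.
    // ConcurrentMap.replace(key, oldValue, newValue) performs the
    // compare-and-swap as a single atomic step.
    static <K, V> boolean swap(ConcurrentMap<K, V> map, K key, V expected, V replacement) {
        return map.replace(key, expected, replacement);
    }

    public static void main(String[] args) {
        // Stand-in map; any ConcurrentMap implementation could be passed instead.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("user:1", "v1");

        System.out.println(swap(cache, "user:1", "v1", "v2")); // true: value was still v1
        System.out.println(swap(cache, "user:1", "v1", "v3")); // false: value is now v2
        System.out.println(cache.get("user:1"));               // v2
    }
}
```

This only covers single-key swaps. It is not a substitute for concurrent transactions spanning several keys.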
3. MapDB could take advantage of DM's componentization approach to
support multiple serializers (we believe each one has its advantages
in different scenarios)
MapDB already supports alternative serializers. Users can supply
their own on Map (similar to table) creation.
I would love to integrate stuff from lightning serializer.
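As a rough illustration of what a user-supplied serializer could look like, here is a sketch of a pluggable serializer contract. Note that ValueSerializer, StringSerializer, toBytes and fromBytes are hypothetical names invented for this example and are not MapDB's or DM's actual API; only the java.io classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class PluggableSerializerSketch {

    // Hypothetical contract for illustration only; not MapDB's real interface.
    interface ValueSerializer<T> {
        void write(DataOutput out, T value) throws IOException;
        T read(DataInput in) throws IOException;
    }

    // One possible user-supplied implementation for String values.
    static final class StringSerializer implements ValueSerializer<String> {
        @Override
        public void write(DataOutput out, String value) throws IOException {
            out.writeUTF(value); // 2-byte length prefix + modified UTF-8 bytes
        }

        @Override
        public String read(DataInput in) throws IOException {
            return in.readUTF();
        }
    }

    // A store would call the serializer it was configured with.
    static <T> byte[] toBytes(ValueSerializer<T> serializer, T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            serializer.write(out, value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static <T> T fromBytes(ValueSerializer<T> serializer, byte[] data) {
        try {
            return serializer.read(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ValueSerializer<String> serializer = new StringSerializer();
        byte[] data = toBytes(serializer, "hello");
        System.out.println(data.length + " bytes, value = " + fromBytes(serializer, data));
    }
}
```

With a contract like this, swapping in another serializer is just a matter of passing a different implementation when the map is created.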
4. MapDB could use DM to write items to an off-heap before writing to
disk (asynchronously) to improve speed
Not sure it would be practical. MapDB already uses memory mapped
files, so the effect would be very similar. My tests show that there
is only a 50% performance difference between the inMemory store and
the onDisk store.
Currently MapDB has only a heap based inMemory store. But implementing
an off heap memory store is trivial and I will do it soon.
This is very nice to know. Looking forward to seeing this feature.
Maybe you should use DM for it?
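For readers wondering what the core of an off-heap memory store amounts to, here is a minimal sketch based on a JDK direct ByteBuffer. It is neither MapDB's nor DM's implementation. OffHeapStoreSketch is a hypothetical name, and the append-only, length-prefixed layout is an assumption chosen to keep the example short (no free-space management, no resizing).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class OffHeapStoreSketch {

    // Direct buffers are allocated outside the Java heap.
    private final ByteBuffer buffer;

    OffHeapStoreSketch(int capacityBytes) {
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Appends a length-prefixed record and returns the offset it was written at.
    synchronized int put(byte[] payload) {
        if (buffer.remaining() < Integer.BYTES + payload.length) {
            throw new IllegalStateException("store is full");
        }
        int offset = buffer.position();
        buffer.putInt(payload.length);
        buffer.put(payload);
        return offset;
    }

    // Reads back the record stored at the given offset.
    synchronized byte[] get(int offset) {
        int length = buffer.getInt(offset);     // absolute read, position unchanged
        byte[] out = new byte[length];
        ByteBuffer view = buffer.duplicate();   // independent position, shared memory
        view.position(offset + Integer.BYTES);
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        OffHeapStoreSketch store = new OffHeapStoreSketch(1024);
        int first = store.put("hello".getBytes(StandardCharsets.UTF_8));
        int second = store.put("world".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.get(first), StandardCharsets.UTF_8));
        System.out.println(new String(store.get(second), StandardCharsets.UTF_8));
    }
}
```

A real store would also need record deletion, compaction and a key index, which is where most of the actual work lies.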
5. We could merge our serialization efforts (I believe lightning is
very fast and worth to be considered) and provide an even better
solution or two alternative implementations
100% agree. I will check lightning sources and see if I could
contribute my stuff. MapDB serialization is very space-efficiency
oriented and it can contribute a lot.
Well, having worked with JDBM's/MapDB's serialization, Kryo and
protostuff, I would say that MapDB's serialization is space-efficient,
but roughly at the same level as Kryo, or a bit worse than the latest
versions of Kryo.
IMHO, the biggest advantage of MapDB's serialization is its speed. It
usually wins against highly optimized versions of Kryo and protostuff,
even though they use Unsafe tricks and the like. To some extent this
speed improvement can probably be attributed to the simplicity of
MapDB's serialization implementation. It is not very feature rich, but
very small and simple (just a few classes), and call stacks during
serialization are usually also very short. Probably the JIT is able to
optimize and inline much better than in other, more complex and
universal frameworks.
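To make the "small, simple, short call stacks" point more concrete, here is a sketch of the general style: one flat dispatch on the value's type that writes a one-byte tag followed by the value. This is not the actual JDBM/MapDB or Quickser code. The class name FlatSerializerSketch, the tag values and the supported types are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FlatSerializerSketch {

    // Hypothetical one-byte type tags.
    static final int TAG_NULL = 0;
    static final int TAG_INT = 1;
    static final int TAG_LONG = 2;
    static final int TAG_STRING = 3;

    // A single method with no reflection and no per-type handler objects,
    // so the call stack stays one or two frames deep.
    static void write(DataOutput out, Object value) throws IOException {
        if (value == null) {
            out.writeByte(TAG_NULL);
        } else if (value instanceof Integer) {
            out.writeByte(TAG_INT);
            out.writeInt((Integer) value);
        } else if (value instanceof Long) {
            out.writeByte(TAG_LONG);
            out.writeLong((Long) value);
        } else if (value instanceof String) {
            out.writeByte(TAG_STRING);
            out.writeUTF((String) value);
        } else {
            throw new IllegalArgumentException("unsupported type: " + value.getClass());
        }
    }

    static Object read(DataInput in) throws IOException {
        int tag = in.readUnsignedByte();
        switch (tag) {
            case TAG_NULL:   return null;
            case TAG_INT:    return in.readInt();
            case TAG_LONG:   return in.readLong();
            case TAG_STRING: return in.readUTF();
            default: throw new IOException("unknown tag: " + tag);
        }
    }

    // Convenience helper for trying the sketch out.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(new DataOutputStream(bytes), value);
            return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
        System.out.println(roundTrip(7L));
        System.out.println(roundTrip("quick"));
        System.out.println(roundTrip(null));
    }
}
```

Whether the JIT really inlines code like this better than a more general framework would have to be measured. The sketch only shows the shape of the code being discussed.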
My only condition is that lightning is distributed in a separate JAR.
I like minimal dependencies.
In both cases we would be open to contribution in different forms -
just contributing patches, or with you joining us and the ASF as a
module or subproject (the latter options have to undergo a formal vote
by all project members, of course), as I strongly believe that merging
efforts would lead to a better and more complete product.
I would prefer MapDB to stay on GitHub. I find it more comfortable to
use. JDBM3 (previous version) nearly became an ApacheDS subproject,
but at the last moment I decided otherwise.
I strongly agree with Jan here. JDBM/MapDB is used by most people as a
DB or persistent map.
Its serialization functionality is nice to have, but not the most
important feature of it.
At the same time, for DM such things as off-heap management and
serialization are the most important ones, but persistence is
optional.
Therefore, IMHO both projects should remain independent and cooperate
or make use of each other. But they should not be integrated into one
"megaproject" that can do everything.
-Roman