Hi Roman,

Your patch saved me a lot of headaches and was very welcome.

I am not using Quickser because serialization in MapDB is still evolving rapidly. For example, I did some refactoring to make serialization less dependent on other JDBM classes. I also plan to use some stuff from Kryo and Lightning (unsafe ops, bytecode generators). And Quickser has not had many updates since it was forked.

Obviously it is more comfortable for me if the serialization framework stays inside MapDB. It is a critical part, since in a database we care about long-term persistence. On the other hand, I would love it if somebody took over this part. I have no problem with extracting serialization into a separate project, but I need to see that the fork is active and can evolve on its own.

I hoped I could use Lightning, Kryo or another framework developed as part of DirectMemory. But there seems to be a conceptual difference. Kryo and Lightning seem to be more like a 'serialization framework': each has a bunch of serializers (for numbers, dates...) and you choose the one that suits you best.

But MapDB should 'just work' without additional configuration. So I need universal serialization: it should turn any object into bytes (similar to Java Serialization or XStream). I also want it to mimic standard Java Serialization (the Serializable marker interface, Externalizable, writeExternal methods, etc.).
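
For illustration, here is a minimal sketch of the standard Java Serialization contract mentioned above, using only java.io. The Point class and the helper names are hypothetical and are not taken from MapDB; the sketch only shows what "turn any object into bytes" looks like with Externalizable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class ExternalizableRoundTrip {

    // Hypothetical value class. Externalizable requires a public no-arg constructor.
    public static class Point implements Externalizable {
        int x;
        int y;

        public Point() { }

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Turns any Serializable or Externalizable object into bytes.
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the object from the bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) fromBytes(toBytes(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);
    }
}
```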

So for now I will investigate whether I can patch Lightning to support my needs. If not, I will take the parts I like and integrate them into MapDB.

Jan

On 07/11/12 09:30, Roman Levenstein wrote:
Hi,

I'm one of the contributors to the JDBM3 serialization implementation.
Actually earlier this year I made it much faster than before (2 orders
of magnitude). And BTW, I'm also a contributor to Kryo and
protostuff-runtime.

I find this discussion very interesting, so let me provide my two cents as well.

First of all, I just want to mention that while working on improving
JDBM's serialization, I extracted the serialization part of the JDBM
into a dedicated serialization library, which I called Quickser. You
can find it on GitHub: https://github.com/romix/quickser
It is really very fast, often faster than Kryo and protostuff. Since
Quickser contains only serialization-related stuff from JDBM/MapDB, it
is easier to use it if you just want to add yet another serialization
method to DM without any DB related functionality.

It could even make sense for MapDB to use Quickser for
serialization instead of having both DB and serialization related
functionality in one pot.

@Jan: What do you think about it? I understand that you don't like
external dependencies. But Quickser is not really external. It is more
or less a copy of JDBM's serialization-related classes.

On Wed, Nov 7, 2012 at 9:49 AM, Jan Kotek <[email protected]> wrote:
Hi,


     1. DirectMemory could make good use of mapdb to serialize least
     frequently used items to disk and free memory
     2. DirectMemory could implement a MapDB disk based store in addition
     to the bytebuffer and unsafe ones
The only problem may be that MapDB currently does not support concurrent
transactions (it has only one single global transaction).
Not sure if it could be a problem.

However it implements ConcurrentMap, so it is possible to swap items
atomically
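
As a rough illustration of the atomic swap that the ConcurrentMap interface allows, the sketch below uses java.util.concurrent.ConcurrentHashMap as a stand-in for a MapDB map. The class and method names are hypothetical; only ConcurrentMap.replace(key, oldValue, newValue) is the standard JDK call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AtomicSwapSketch {

    // Replaces the value only if it is still the one the caller last saw.
    static boolean swap(ConcurrentMap<String, String> map, String key,
                        String expected, String replacement) {
        return map.replace(key, expected, replacement);
    }

    public static void main(String[] args) {
        // ConcurrentHashMap stands in for any ConcurrentMap implementation.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("item", "v1");
        System.out.println(swap(cache, "item", "v1", "v2")); // true
        System.out.println(swap(cache, "item", "v1", "v3")); // false, value is already v2
        System.out.println(cache.get("item"));               // v2
    }
}
```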


     3. MapDB could take advantage of DM's componentization approach to
     support multiple serializers (we believe each one has its advantages
     in different scenarios)
MapDB already supports alternative serializers. Users can supply their own
when creating a Map (similar to a table).
I would love to integrate stuff from lightning serializer.
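
To make the idea of a serializer supplied at map creation concrete, here is a small sketch. The ValueSerializer interface and the ByteBackedMap class are hypothetical and are not MapDB's actual API; they only show the general shape of passing a serializer in when the map is created.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

final class PluggableSerializerSketch {

    // Hypothetical interface, not MapDB's real serializer API.
    interface ValueSerializer<T> {
        void serialize(DataOutput out, T value) throws IOException;
        T deserialize(DataInput in) throws IOException;
    }

    // Example user-supplied serializer for strings.
    static final ValueSerializer<String> STRING = new ValueSerializer<String>() {
        @Override
        public void serialize(DataOutput out, String value) throws IOException {
            out.writeUTF(value);
        }

        @Override
        public String deserialize(DataInput in) throws IOException {
            return in.readUTF();
        }
    };

    // Toy map that stores values as bytes, using the serializer chosen at creation.
    static final class ByteBackedMap<V> {
        private final Map<String, byte[]> records = new HashMap<>();
        private final ValueSerializer<V> serializer;

        ByteBackedMap(ValueSerializer<V> serializer) {
            this.serializer = serializer;
        }

        void put(String key, V value) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try {
                serializer.serialize(new DataOutputStream(buffer), value);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            records.put(key, buffer.toByteArray());
        }

        V get(String key) {
            byte[] bytes = records.get(key);
            if (bytes == null) {
                return null;
            }
            try {
                return serializer.deserialize(new DataInputStream(new ByteArrayInputStream(bytes)));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        ByteBackedMap<String> map = new ByteBackedMap<>(STRING);
        map.put("greeting", "hello");
        System.out.println(map.get("greeting")); // hello
    }
}
```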


     4. MapDB could use DM to write items to an off-heap before writing to
     disk (asynchronously) to improve speed
Not sure it would be practical. MapDB already uses memory mapped files so
the effect would be very similar. My tests show that there is only a 50%
performance difference between the inMemory store and the onDisk store.

Currently MapDB has only a heap based inMemory store. But implementing an
off-heap memory store is trivial and I will do it soon.
This is very nice to know. Looking forward to seeing this feature. Maybe
you should use DM for it?
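
For readers unfamiliar with the term, an off-heap store keeps record bytes outside the Java heap. The sketch below is only an assumption of what a very simple one could look like, built on the JDK's ByteBuffer.allocateDirect; the class name and the length-prefixed layout are hypothetical and do not describe MapDB's or DM's implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class OffHeapStoreSketch {
    // Direct buffers are allocated outside the garbage-collected Java heap.
    private final ByteBuffer memory;

    OffHeapStoreSketch(int capacityBytes) {
        memory = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Appends a length-prefixed record and returns the offset it was written at.
    // Throws BufferOverflowException when the buffer is full.
    int append(byte[] record) {
        int offset = memory.position();
        memory.putInt(record.length);
        memory.put(record);
        return offset;
    }

    // Reads the record stored at the given offset without moving the write position.
    byte[] read(int offset) {
        ByteBuffer view = memory.duplicate();
        view.position(offset);
        byte[] record = new byte[view.getInt()];
        view.get(record);
        return record;
    }

    public static void main(String[] args) {
        OffHeapStoreSketch store = new OffHeapStoreSketch(1024);
        store.append("hello".getBytes(StandardCharsets.UTF_8));
        int second = store.append("world".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.read(second), StandardCharsets.UTF_8)); // world
    }
}
```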

     5. We could merge our serialization efforts (I believe lightning is
     very fast and worth considering) and provide an even better solution
     or two alternative implementations

100% agree. I will check lightning sources and see if I could contribute my
stuff. MapDB serialization is very space-efficiency oriented and it can
contribute a lot.
Well, having worked with JDBM's/MapDB's serialization, Kryo and
protostuff, I would say that MapDB's serialization is space-efficient,
but roughly at the same level as Kryo or a bit worse than latest
versions of Kryo.

IMHO, the biggest advantage of MapDB's serialization is its speed. It
usually wins against highly optimized versions of Kryo and protostuff,
even though they use Unsafe tricks and the like. To some extent this
speed improvement can probably be attributed to the simplicity of
MapDB's serialization implementation. It is not very feature rich, but
very small and simple (just a few classes) and call stacks during
serialization are usually also very short. The JIT is probably able to
optimize and inline much better than in other more complex and
universal frameworks.

My only condition is that Lightning is distributed in a separate JAR. I like
minimal dependencies.


In both cases we would be open to contribution in different forms - just
contributing patches, or you joining us and the ASF as a module or
subproject (the latter options have to undergo a formal vote by all
project members, of course) - as I strongly believe that merging efforts
would lead to a better and more complete product.
I would prefer MapDB to stay on GitHub. I find it more comfortable to use.
JDBM3 (the previous version) nearly became an ApacheDS subproject, but at the
last moment I decided otherwise.
I strongly agree with Jan here. JDBM/MapDB is used by most people as a
DB or persistent map.
Its serialization functionality is nice to have, but not the most
important feature of it.
At the same time, for DM things like off-heap mgmt and
serialization are the most important ones, but persistency is
optional.
Therefore, IMHO both projects should remain independent and cooperate
or make use of each other. But they should not be integrated into one
"megaproject", which can do everything.

-Roman

