Re: MapDB

Jan Kotek Wed, 07 Nov 2012 13:03:03 -0800

> As long as all get down to base types (no matter in what hierarchy layer) it'll work out of the box.

I think the basic problem is what a 'base type' is. It is very space-inefficient to treat HashMap or Date as a POJO. For MapDB serialization, HashMap, Date (and many other common data types) are base types.
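To illustrate the space argument, here is a minimal sketch that writes a Date as a one-byte type tag plus its epoch milliseconds and compares that with what standard Java Serialization produces for the same object. The tag value and the class name are made up for this example and are not taken from MapDB.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Date;

final class DateAsBaseType {
    // Hypothetical type tag, chosen for this sketch only.
    static final byte TAG_DATE = 1;

    /** Writes the date as a base type: one tag byte plus eight bytes of epoch millis. */
    static byte[] writeCompact(Date date) {
        return ByteBuffer.allocate(1 + Long.BYTES)
                .put(TAG_DATE)
                .putLong(date.getTime())
                .array();
    }

    /** Reads back a date written by writeCompact. */
    static Date readCompact(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        byte tag = in.get();
        if (tag != TAG_DATE) {
            throw new IllegalArgumentException("unexpected tag " + tag);
        }
        return new Date(in.getLong());
    }

    /** Writes the date as a generic object using standard Java Serialization. */
    static byte[] writeJava(Date date) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(date);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println("base type: " + writeCompact(now).length + " bytes");
        System.out.println("java.io:   " + writeJava(now).length + " bytes");
    }
}
```

The generic form has to carry a stream header and a class description in addition to the value, which is the overhead a dedicated base-type encoding avoids.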


Jan

On 07/11/12 20:40, Christoph Engelbert wrote:

On 07.11.2012 21:30, Jan Kotek wrote:

Hi Roman,

your patch saved me a lot of headaches and was very welcome.

I am not using Quickser because serialization in MapDB is still
evolving rapidly. For example, I did some refactoring to make
serialization less dependent on other JDBM classes. I also plan
to use some stuff from Kryo and Lightning (unsafe ops, bytecode
generators). And Quickser has not had many updates since it was
forked.

Obviously it is more comfortable for me if the serialization
framework stays inside MapDB. It is a critical part, since in a
database we care about long-term persistence. On the other hand, I
would love it if somebody took over this part. I have no problem
with extracting serialization into a separate project, but I need
to see that this fork is active and can evolve on its own.

I hoped I could use Lightning, Kryo or another framework developed
as part of DirectMemory. But there seems to be a conceptual
difference. Kryo and Lightning seem to be more like a
'serialization framework': each has a bunch of serializers (for
numbers, dates...) and you choose the one which suits you best.

But MapDB should 'just work' without additional configuration. So
I need universal serialization; it should turn any object into
bytes (similar to Java Serialization or XStream). I also want it
to mimic standard Java Serialization (the Serializable marker
interface, Externalizable with its writeExternal method, etc.).
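For readers less familiar with the contract mentioned here, the following sketch shows the standard Externalizable hooks using only java.io. The Point class and the helper names are hypothetical. This shows the JDK behaviour that a universal serializer would mimic, not MapDB's own implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class ExternalizableRoundTrip {

    /** Hypothetical value class that controls its own wire format. */
    static final class Point implements Externalizable {
        int x;
        int y;

        /** Externalizable requires a public no-arg constructor. */
        public Point() { }

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    /** Turns any serializable object into bytes without per-type configuration. */
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Restores an object written by toBytes. */
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = (Point) fromBytes(toBytes(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

The caller only ever sees toBytes and fromBytes; the class itself decides what gets written, which is the "just works" behaviour described above.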

OK, now I have time to get into this discussion :-) First I need to
say, it's nice to see that Lightning got some attention. It's always
nice to see your baby grow up.

Lightning in general is a nearly complete serializer. You can
serialize a lot of classes by just telling it to take all
"attributes" in a class and serialize them. As long as all get down
to base types (no matter in what hierarchy layer) it'll work out of
the box.
When the serializer is initialized, all dependent classes are
analysed and the bytecode marshallers are generated (or at least one
of the base marshallers is used).

There is no need for Externalizable or Serializable (but both can be
serialized), and there's another Lightning-internal interface,
Streamed, which serves the same purpose as Externalizable.

So for now I will investigate whether I can patch Lightning to
support my needs. If not, I will take the parts I like and integrate
them into MapDB.

I'd love to see some help and to give backup in the investigation.

Jan

On 07/11/12 09:30, Roman Levenstein wrote:

Hi,

I'm one of the contributors to the JDBM3 serialization
implementation. Actually, earlier this year I made it much faster
than before (by 2 orders of magnitude). And BTW, I'm also a
contributor to Kryo and protostuff-runtime.

I find this discussion very interesting, so let me provide my two
cents as well.

First of all, I just want to mention that while working on improving
JDBM's serialization, I extracted the serialization part of JDBM
into a dedicated serialization library, which I called Quickser. You
can find it on GitHub: https://github.com/romix/quickser
It is really very fast, often faster than Kryo and protostuff. Since
Quickser contains only the serialization-related stuff from
JDBM/MapDB, it is easier to use if you just want to add yet another
serialization method to DM without any DB-related functionality.

It could even make sense for MapDB to use Quickser for
serialization instead of having both DB and serialization-related
functionality in one pot.

@Jan: What do you think about it? I understand that you don't like
external dependencies. But Quickser is not really external. It is
more or less a copy of JDBM's serialization-related classes.

On Wed, Nov 7, 2012 at 9:49 AM, Jan Kotek <[email protected]> wrote:

Hi,

      1. DirectMemory could make good use of mapdb to serialize least
      frequently used items to disk and free memory
      2. DirectMemory could implement a MapDB disk based store in
      addition to the bytebuffer and unsafe ones

The only problem may be that MapDB currently does not support
concurrent transactions (it has only a single global transaction).
Not sure if that would be a problem.

However, it implements ConcurrentMap, so it is possible to swap
items atomically.
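As a sketch of what "swap items atomically" means in terms of the java.util.concurrent.ConcurrentMap contract, the helper below replaces a value only if it still equals the expected one. A ConcurrentHashMap stands in for a MapDB map here; that substitution and the helper name are assumptions made for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AtomicSwap {

    /**
     * Replaces the value for key only if it is currently mapped to expected.
     * Returns true when the swap happened, false when another writer got
     * there first or the key is absent.
     */
    static <K, V> boolean swap(ConcurrentMap<K, V> map, K key, V expected, V replacement) {
        return map.replace(key, expected, replacement);
    }

    public static void main(String[] args) {
        // Stand-in for any ConcurrentMap implementation.
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        map.put("item", "in-memory");

        System.out.println(swap(map, "item", "in-memory", "on-disk")); // swap succeeds
        System.out.println(swap(map, "item", "in-memory", "stale"));   // value already changed
        System.out.println(map.get("item"));
    }
}
```

Because the check and the update are a single operation, a cache could move an entry between tiers without a surrounding transaction, as long as each move touches one key.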

      3. MapDB could take advantage of DM's componentization approach
      to support multiple serializers (we believe each one has its
      advantages in different scenarios)

MapDB already supports alternative serializers. Users can supply
their own on Map (similar to table) creation.
I would love to integrate stuff from the Lightning serializer.

      4. MapDB could use DM to write items to an off-heap before
      writing to disk (asynchronously) to improve speed

Not sure it would be practical. MapDB already uses memory-mapped
files, so the effect would be very similar. My tests show that
there is only a 50% performance difference between the inMemory
store and the onDisk store.
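For context on why a memory-mapped file behaves much like an off-heap buffer, here is a minimal sketch using only java.nio. It writes a value through a MappedByteBuffer and then reads it back from the file through the ordinary file API. The class and method names are hypothetical and this is not MapDB's storage code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedStoreSketch {

    /** Writes value at offset 0 through a memory mapping, then reads it back from the file. */
    static long writeThenRead(Path file, long value) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping lives outside the Java heap; puts go to the OS page cache.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            buffer.putLong(0, value);
            buffer.force(); // ask the OS to flush the mapped region to disk
            return ByteBuffer.wrap(Files.readAllBytes(file)).getLong(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience wrapper that runs the round trip on a temporary file. */
    static long roundTripOnTempFile(long value) {
        try {
            Path file = Files.createTempFile("mapped-store", ".bin");
            file.toFile().deleteOnExit();
            return writeThenRead(file, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripOnTempFile(42L));
    }
}
```

Writes to the mapping are plain memory stores until the operating system flushes them, which is consistent with the point above that an extra off-heap staging layer may add little.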

Currently MapDB has only a heap-based inMemory store. But
implementing an off-heap memory store is trivial and I will do it
soon.

This is very nice to know. Looking forward to seeing this feature.
Maybe you should use DM for it?

      5. We could merge our serialization efforts (I believe lightning
      is very fast and worth considering) and provide an even better
      solution or two alternative implementations

100% agree. I will check the Lightning sources and see if I could
contribute my stuff. MapDB serialization is very space-efficiency
oriented and it can contribute a lot.

Well, having worked with JDBM's/MapDB's serialization, Kryo and
protostuff, I would say that MapDB's serialization is
space-efficient, but roughly at the same level as Kryo or a bit
worse than the latest versions of Kryo.

IMHO, the biggest advantage of MapDB's serialization is its speed.
It usually wins against highly optimized versions of Kryo and
protostuff, even though they use Unsafe tricks and the like. To some
extent this speed improvement can probably be attributed to the
simplicity of MapDB's serialization implementation. It is not very
feature-rich, but it is very small and simple (just a few classes),
and call stacks during serialization are usually also very short.
The JIT is probably able to optimize and inline much better than in
other, more complex and universal frameworks.

My only condition is that Lightning is distributed in a separate
JAR. I like minimal dependencies.

In both cases we would be open to contribution in different forms -
either just contributing patches, or you joining us and the ASF as a
module or subproject (the latter options have to undergo a formal
vote by all project members, of course), as I strongly believe that
merging efforts would lead to a better and more complete product.

I would prefer MapDB to stay on GitHub. I find it more comfortable
to use. JDBM3 (the previous version) nearly became an ApacheDS
subproject, but at the last moment I decided otherwise.

I strongly agree with Jan here. JDBM/MapDB is used by most people as
a DB or persistent map.
Its serialization functionality is nice to have, but not its most
important feature.
At the same time, for DM things like off-heap management and
serialization are the most important ones, but persistence is
optional.
Therefore, IMHO both projects should remain independent and
cooperate or make use of each other. But they should not be
integrated into one "megaproject" which can do everything.

-Roman
