Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Marten Feldtmann
What I find really interesting is the TNetString approach Mongrel2 is
using instead of JSON.
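
As a rough illustration of why the format is attractive, here is a minimal
sketch of a tnetstring encoder in Python. It is not Mongrel2's code; the
function name tnet_dumps is made up for this example, and floats are left
out. Each value is framed as <length>:<payload><type tag>, so a reader
always knows how many bytes to consume before it looks at the type.

```python
def tnet_dumps(value):
    """Encode a Python value as a tnetstring: <length>:<payload><type tag>."""
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):
        # Checked before int because bool is a subclass of int in Python.
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, str):
        payload, tag = value.encode("utf-8"), b","
    elif isinstance(value, list):
        payload, tag = b"".join(tnet_dumps(item) for item in value), b"]"
    elif isinstance(value, dict):
        # A dictionary payload is the concatenation of key/value tnetstrings.
        payload = b"".join(tnet_dumps(k) + tnet_dumps(v) for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("unsupported type: %r" % type(value))
    # The length prefix counts payload bytes only, not the trailing type tag.
    return str(len(payload)).encode("ascii") + b":" + payload + tag


if __name__ == "__main__":
    print(tnet_dumps({"hello": "world"}))  # b'16:5:hello,5:world,}'
```

Nested values carry their own length prefixes, which is what makes the
format easy to parse without scanning for delimiters.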

Marten

On 31.03.2012 03:23, Steven McCoy wrote:
 Interesting tidbit from a YouTube presentation,

 *Serialization formats* - no matter which one you use, they are all
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Wolfgang Richter
The only issue with BSON is that it's not entirely generic---the spec
has types that are specific to MongoDB (like DBPointer, stuff to ship
JavaScript code with context, etc.).

This comes from my experience implementing BSON in C.  And, yes, I'd
believe it could be fast (maybe not with string-based keys though?).
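
To make the string-based keys concrete, here is a minimal sketch of the
BSON document layout in Python. This is not the C implementation mentioned
above; bson_encode and bson_cstring are hypothetical names, and only flat
documents with string and int32 values are handled. Every element carries
a one-byte type, the key as a NUL-terminated string, and then the value.

```python
import struct


def bson_cstring(text):
    """Keys are stored as UTF-8 bytes followed by a NUL terminator."""
    return text.encode("utf-8") + b"\x00"


def bson_encode(doc):
    """Encode a flat dict with str or int32 values using the BSON layout."""
    body = b""
    for key, value in doc.items():
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # Type 0x02: key, int32 byte length (including the NUL), bytes.
            body += b"\x02" + bson_cstring(key) + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # Type 0x10: key, little-endian int32 (struct.error if out of range).
            body += b"\x10" + bson_cstring(key) + struct.pack("<i", value)
        else:
            raise TypeError("unsupported value type: %r" % type(value))
    # The total length counts its own 4 bytes plus the trailing NUL.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


if __name__ == "__main__":
    encoded = bson_encode({"hello": "world"})
    print(len(encoded), encoded)
```

For {"hello": "world"} this yields 22 bytes, of which the key and its
terminator account for 6; the key is repeated in full in every document.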

I feel the specification could be lighter; a light BSON would
be ideal for a lot of applications, I think.

In fact, a light BSON paired with 0MQ is basically what I'm
currently working on in a project :-)

--
Wolf

PS: If anyone would like my C BSON implementation, feel free to ask me;
I've been thinking about releasing it under the MIT License.

On Sat, Mar 31, 2012 at 3:57 AM, Marten Feldtmann
itli...@schrievkrom.de wrote:
 What I find really interesting is the TNetString approach Mongrel2 is
 using instead of json.

 Marten

 On 31.03.2012 03:23, Steven McCoy wrote:
 Interesting tidbit from a YouTube presentation,

     *Serialization formats* - no matter which one you use, they are all


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Rick Olson
How does BSON compare to msgpack?  I've started using that in places.


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Wolfgang Richter
On Sat, Mar 31, 2012 at 5:33 PM, Rick Olson technowee...@gmail.com wrote:
 How's BSON compare to msgpack?  I've started using that in places.

In my mind, it seems like the performance of BSON and msgpack could be
comparable.

msgpack's specification is more generic than BSON's (no
MongoDB-specifics), and it seems to be somewhat better specified.  In
addition, msgpack doesn't require a string 'key' per message, and its
format seems to be more compact (space-efficient) than BSON's.  This
might imply quicker encoding/decoding, although that could also be
implementation-specific.
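
A small size comparison, as a sketch only: the Python below hand-encodes
the smallest msgpack cases (short strings, small non-negative integers and
small maps) and compares the result with the BSON bytes for the same
document. The name msgpack_encode is made up for this example, it is not
the msgpack library, and BSON_HELLO is simply the BSON encoding of
{"hello": "world"} written out as a literal.

```python
# BSON encoding of {"hello": "world"}: int32 length, one string element, NUL.
BSON_HELLO = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"


def msgpack_encode(obj):
    """Encode only the compact msgpack cases; anything larger is rejected."""
    if isinstance(obj, bool):
        raise TypeError("booleans are left out of this sketch")
    if isinstance(obj, int):
        if 0 <= obj < 128:
            return bytes([obj])                      # positive fixint: one byte
        raise ValueError("integer outside the range handled here")
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) < 32:
            return bytes([0xA0 | len(raw)]) + raw    # length packed into the tag byte
        raise ValueError("string too long for this sketch")
    if isinstance(obj, dict):
        if len(obj) < 16:
            out = bytes([0x80 | len(obj)])           # fixmap: entry count in the tag
            for key, value in obj.items():
                out += msgpack_encode(key) + msgpack_encode(value)
            return out
        raise ValueError("map too large for this sketch")
    raise TypeError("unsupported type: %r" % type(obj))


if __name__ == "__main__":
    packed = msgpack_encode({"hello": "world"})
    print("msgpack:", len(packed), "bytes; BSON:", len(BSON_HELLO), "bytes")
```

For this one document msgpack needs 13 bytes against BSON's 22, mostly
because lengths fit inside the tag byte and there are no NUL terminators
or 4-byte length fields. Whether that translates into speed still depends
on the implementation.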

msgpack looks really nice :-)

Both seem simple enough to implement on your own (no external
dependencies introduced, which can be nice).

However, msgpack seems nice because an ecosystem of software including
RPC is built around it (although that reinvents the communication layers
which could be managed by 0MQ...).

BSON's canonical implementation is the one included in MongoDB.

I'd expect things like msgpack to win in the long run (unless the
BSON spec is changed to be simpler+more generic; it's already simpler
than msgpack though) because they are more generic.



Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Steven McCoy
On 31 March 2012 17:33, Rick Olson technowee...@gmail.com wrote:

 How's BSON compare to msgpack?  I've started using that in places.


Not to take away from MsgPack having a more convenient API to use,
but MsgPack performs surprisingly worse than Protocol Buffers.  Despite the
claims on their website, the MsgPack project's testing procedure is
unfortunately flawed.  This has previously been raised on the list.

After looking at http://bsonspec.org/#/specification I'm not sure how
YouTube is finding BSON to be faster.  It looks like a first-generation
format like TIBCO's forms, and not like second-generation qforms (using a
dictionary) or third-generation rforms (using dynamic dictionaries).
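
To show what a dictionary buys in general terms, here is a sketch in
Python. It is not TIBCO's wire format: FIELD_IDS, the record layout and
all function names are invented for this illustration. The first encoder
repeats each field name in every message, as BSON does with its keys; the
second sends a small integer ID that both sides look up in a dictionary
agreed on out of band.

```python
import struct

# Hypothetical field dictionary shared by sender and receiver.
FIELD_IDS = {"price": 1, "quantity": 2, "timestamp": 3}
FIELD_NAMES = {field_id: name for name, field_id in FIELD_IDS.items()}


def encode_with_names(record):
    """Self-describing: length-prefixed field name, then an int32 value."""
    out = b""
    for name, value in record.items():
        raw = name.encode("utf-8")
        out += struct.pack("<B", len(raw)) + raw + struct.pack("<i", value)
    return out


def encode_with_dictionary(record):
    """Dictionary-coded: uint16 field ID, then an int32 value (6 bytes each)."""
    return b"".join(struct.pack("<Hi", FIELD_IDS[name], value)
                    for name, value in record.items())


def decode_with_dictionary(data):
    """Map the IDs back to names using the shared dictionary."""
    return {FIELD_NAMES[field_id]: value
            for field_id, value in struct.iter_unpack("<Hi", data)}


if __name__ == "__main__":
    record = {"price": 1050, "quantity": 3, "timestamp": 1333152000}
    print(len(encode_with_names(record)), len(encode_with_dictionary(record)))
```

For the three-field record above the named form takes 37 bytes and the
dictionary-coded form 18, and the decoder compares integers instead of
strings. The cost is that both ends must hold the same dictionary.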

More development effort appears to go towards convenience, as hardware
improvements make bit tweaking less worthwhile.

-- 
Steve-o


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Lourens Naudé
I think it's more difficult to draw comparisons when one factors in the
binding ecosystem as well - a large part of the community uses libzmq
through some higher-level binding. Most serialization wrappers tend to
create additional heap cruft that stresses the GC in some languages. Here's
an interesting case study:

* Details:
http://www.ohler.com/software/thoughts/Blog/Entries/2012/3/13_Need_for_Speed.html
* Implementation: https://github.com/ohler55/oj

So in summary, watch out for edge cases where micro-benchmarks of an
implementation are fast, yet the implementation introduces a hidden cost in
GC pressure relative to message volume, which can destroy any soft real-time
guarantees and overall system throughput / performance.
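
One way to look at the allocation side next to the timing side, sketched
in Python with the standard tracemalloc module. The message shape and all
function names are made up for this example. Python relies mostly on
reference counting, so this shows allocation volume growing with message
count rather than the pause behaviour of a tracing GC, but it is the kind
of number a timing-only micro-benchmark never reports.

```python
import json
import tracemalloc


def peak_traced_bytes(func, frames):
    """Run func(frames) and return the peak bytes allocated while it ran."""
    tracemalloc.start()
    try:
        func(frames)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def decode_retaining(frames):
    # Every decoded message stays alive, so the live heap grows with volume.
    return [json.loads(frame) for frame in frames]


def decode_streaming(frames):
    # Each decoded message becomes garbage before the next one is decoded.
    total = 0
    for frame in frames:
        total += json.loads(frame)["seq"]
    return total


def make_frames(count):
    # Hypothetical message shape, chosen only for this illustration.
    return [json.dumps({"seq": i, "topic": "demo", "body": "x" * 32}).encode("utf-8")
            for i in range(count)]


if __name__ == "__main__":
    frames = make_frames(10000)
    print("retaining:", peak_traced_bytes(decode_retaining, frames), "bytes")
    print("streaming:", peak_traced_bytes(decode_streaming, frames), "bytes")
```

Both functions do the same decoding work per message, so a per-message
timing would look similar; the peak memory figures differ by orders of
magnitude once the message count is large.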

I also think the topic should perhaps be taken off the list since libzmq
does not impose message structure, BUT it's also important to keep tabs on
this (recommendations, real production feedback, etc.) somewhere on the
wiki or docs for reference.

- Lourens



Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Wolfgang Richter
On Sat, Mar 31, 2012 at 6:40 PM, Steven McCoy steven.mc...@miru.hk wrote:
 On 31 March 2012 17:33, Rick Olson technowee...@gmail.com wrote:

 How's BSON compare to msgpack?  I've started using that in places.


 Not to dissuade from MsgPack having a more convenient API to use,
 but MsgPack is surprisingly worse than Protocol Buffers.  Despite their
 website claims, unfortunately the MsgPack projects testing procedure is
 flawed.  This has previously been raised on the list.

Right, although this is implementation-specific.


 After looking at http://bsonspec.org/#/specification I'm not sure how
 YouTube is finding BSON to be faster.  It looks like a 1st generation format
 like TIBCO's forms and not second generation qforms (using a dictionary), or
 third generation rforms (using dynamic dictionaries).

I agree; with string-based keys and space inefficiency, I'm wondering a bit too.

However, I think the key is in their custom BSON implementation.

I highly doubt YouTube uses vanilla BSON.

I think it is more likely that, just as BSON was inspired by JSON
(but is not just a simple extension of it), YouTube is using something
inspired by BSON, but optimized for their use cases and implemented
accordingly.

Note they claim an implementation which is 10-15 times faster than the
one you can download, which to me means they have a custom
_implementation_ of BSON, possibly differing from the spec, that is faster
than the MongoDB canonical implementation.


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Wolfgang Richter
 I also think the topic should perhaps be taken off the list since libzmq
 does not impose message structure, BUT it's also important to keep tabs on
 this ( recommendations or real production feedback etc. ) somewhere on the
 wiki or docs for reference.



True, although every now and then questions crop up regarding how to
send data structures via 0MQ across languages etc.

I think the documentation references ProtoBufs (the FAQ does:
http://www.zeromq.org/area:faq); maybe we should add a list of
alternatives for people to look at (this is at least interesting to
some in the community)?

If 0MQ is trying its utmost to be performant in pushing messages, it
would be nice to pair it with a performant
serialization/deserialization solution.

YouTube reports that ProtoBufs is not so performant, which I guess
started this thread (and maybe many people pair ProtoBufs with
0MQ...).

--
Wolf


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Wolfgang Richter
Updated the FAQ to reflect the fact that choice of serialization
format/library isn't simple, and there are multiple solutions.

If you want to see the diff/what I added (feel free to add more),
check the history and compare revisions 124 and 125:

http://www.zeromq.org/area:faq

--
Wolf



Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Pieter Hintjens
On Sat, Mar 31, 2012 at 6:35 PM, Wolfgang Richter w...@cs.cmu.edu wrote:

 I think the documentation references ProtoBufs (FAQ does:
 http://www.zeromq.org/area:faq), maybe we should add a list of
 alternatives for people to look at (this is at least interesting to
 some in the community)?

Yes, it's a common question, good to collect useful answers.

-Pieter


Re: [zeromq-dev] BSON as high performance serialisation

2012-03-31 Thread Justin Karneges
ASN.1/BER/DER?

*ducks*

On Saturday, March 31, 2012 04:42:58 PM Pieter Hintjens wrote:
 On Sat, Mar 31, 2012 at 6:35 PM, Wolfgang Richter w...@cs.cmu.edu wrote:
  I think the documentation references ProtoBufs (FAQ does:
  http://www.zeromq.org/area:faq), maybe we should add a list of
  alternatives for people to look at (this is at least interesting to
  some in the community)?
 
 Yes, it's a common question, good to collect useful answers.
 
 -Pieter


[zeromq-dev] BSON as high performance serialisation

2012-03-30 Thread Steven McCoy
Interesting tidbit from a YouTube presentation,

*Serialization formats* - no matter which one you use, they are all
 expensive. Measure. Don’t use pickle. Not a good choice. Found protocol
 buffers slow. They wrote their own BSON implementation which is 10-15 time
 faster than the one you can download.


http://highscalability.com/blog/2012/3/26/7-years-of-youtube-scalability-lessons-in-30-minutes.html
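
Taking the "Measure" advice literally, here is a minimal benchmarking
sketch in Python using only the standard library. The record, the three
candidate encoders and the function names are all invented for this
example; it says nothing about YouTube's own BSON implementation or about
Protocol Buffers. It reports encode time per call and encoded size for
pickle, JSON and a fixed struct layout.

```python
import json
import pickle
import struct
import timeit

# Hypothetical record used only for this measurement.
RECORD = {"id": 12345, "views": 67890, "rating": 4}


def pack_struct(record):
    """Fixed layout: three little-endian int32 fields, no field names on the wire."""
    return struct.pack("<iii", record["id"], record["views"], record["rating"])


CANDIDATES = {
    "pickle": lambda: pickle.dumps(RECORD),
    "json": lambda: json.dumps(RECORD).encode("utf-8"),
    "struct": lambda: pack_struct(RECORD),
}


def measure(number=20000, repeat=3):
    """Return {name: (seconds per encode, encoded size in bytes)}."""
    results = {}
    for name, encode in CANDIDATES.items():
        # Best of several runs reduces noise from other activity on the machine.
        best = min(timeit.repeat(encode, number=number, repeat=repeat))
        results[name] = (best / number, len(encode()))
    return results


if __name__ == "__main__":
    for name, (seconds, size) in measure().items():
        print("%-6s %8.2f us/encode %4d bytes" % (name, seconds * 1e6, size))
```

The numbers depend heavily on the interpreter, the data shape and the
machine, which is the point: run it against your own messages before
picking a format.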

BSON is short for Binary JSON:

http://en.wikipedia.org/wiki/BSON

-- 
Steve-o