Awesome analysis and input, Thad, and great references.  Reading through
them now.  Flagging the library's lack of stream orientation is
especially helpful.

Somewhat related: we've pondered adding an annotation that can be
placed on processors, so that those able to operate on input and
output streams without loading full objects into memory get a visual
flag/indicator in the UI.  The idea is that it would help dataflow
managers at least realize when what they're doing can create memory
congestion or scalability issues.  What do you think of that idea?
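
To make the idea a bit more concrete, here is a minimal sketch of what
such a marker could look like.  Everything in it is hypothetical: the
annotation name StreamOriented, the two placeholder processor classes,
and the isStreamOriented helper are made up for illustration and are
not existing API.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class StreamingFlagSketch {

    /**
     * Hypothetical marker annotation: the processor reads and writes
     * content as streams and never holds a full object in memory.
     * RUNTIME retention lets a UI layer discover it via reflection.
     */
    @Documented
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    public @interface StreamOriented {
        /** Optional free-text note that a UI could show as a tooltip. */
        String description() default "";
    }

    /** Placeholder for a processor that copies content chunk by chunk. */
    @StreamOriented(description = "copies input to output in fixed-size chunks")
    public static class ChunkedCopyProcessor { }

    /** Placeholder for a processor that must load the whole document first. */
    public static class WholeDocumentTransformProcessor { }

    /** What a UI might call to decide whether to draw the indicator. */
    public static boolean isStreamOriented(Class<?> processorClass) {
        return processorClass.isAnnotationPresent(StreamOriented.class);
    }

    public static void main(String[] args) {
        System.out.println(isStreamOriented(ChunkedCopyProcessor.class));
        System.out.println(isStreamOriented(WholeDocumentTransformProcessor.class));
    }
}
```

With something like this, a processor that has to hold the whole
payload in memory would simply go unmarked, and the UI could treat the
missing marker as the cue to warn about large inputs.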

On Fri, Apr 8, 2016 at 10:03 PM, Thad Guidry <[email protected]> wrote:
> Frank's work utilizes the Jolt spec (Apache 2 license), which is a great way
> to handle JSON-to-JSON transforms, in my opinion.
>
> Jolt is not a good fit for Process or Rules (use Groovy, Java, etc. for
> those), but transforming JSON in a declarative way with Jolt beats the pants
> off of anything else out there. It's not stream based, though, and can
> consume a lot of memory when your JSON payload is huge (300 MB JSON files,
> for example), but it's fine for most JSON payloads in the wild.
>
> "Two things to be aware of:
>
> - Jolt is not "stream" based, so if you have a very large Json document to
>   transform you need to have enough memory to hold it.
> - The transform process will create and discard a lot of objects, so the
>   garbage collector will have work to do.
> "
>
> A few more details about how it can be used are mentioned on its official
> page here:
> http://bazaarvoice.github.io/jolt/
>
> A demo of Jolt to see how you can transform Json to Json (click the
> Transform button):
> http://jolt-demo.appspot.com/#ritwickgupta
>
> Here's the rough performance of Jolt in 2013, where an 80k JSON file is
> shifted in about 5 secs (the author's notes on this slide are interesting):
> https://docs.google.com/presentation/d/1sAiuiFC4Lzz4-064sg1p8EQt2ev0o442MfEbvrpD1ls/edit#slide=id.g9ac79e71_01
>
> Thad
> +ThadGuidry
