More information can be found via JIRA :-)

State management for operators:

https://issues.apache.org/jira/browse/APEXMALHAR-1897

Take, for example, a join operator that may accumulate more data for a
window than fits in available memory before it emits results. With the
embedded store, key/value data can be stored in HDFS, updated
incrementally, and brought back into memory when needed. More work is in
progress to build data structures on top of it.
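
To illustrate the idea only, here is a minimal sketch of a key/value store
that keeps a bounded number of entries in memory, spills the least recently
used ones to storage, and reads them back on demand. It is not the Malhar
implementation or its API: the class name SpillingStore, its methods, and
the use of a local directory in place of HDFS are all assumptions made for
this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the Malhar state management API. */
public class SpillingStore {
    private final Path dir;         // local directory standing in for HDFS (assumption)
    private final int maxInMemory;  // entries kept in memory before spilling
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, String> memory =
        new LinkedHashMap<>(16, 0.75f, true);

    public SpillingStore(Path dir, int maxInMemory) {
        this.dir = dir;
        this.maxInMemory = maxInMemory;
    }

    /** Stores a non-null value, spilling least recently used entries if over the limit. */
    public void put(String key, String value) {
        memory.put(key, value);
        Iterator<Map.Entry<String, String>> it = memory.entrySet().iterator();
        while (memory.size() > maxInMemory && it.hasNext()) {
            Map.Entry<String, String> eldest = it.next();
            write(eldest.getKey(), eldest.getValue());
            it.remove();
        }
    }

    /** Returns the value from memory, or loads it back from storage; null if unknown. */
    public String get(String key) {
        String value = memory.get(key);
        if (value != null) {
            return value;
        }
        Path file = fileFor(key);
        if (!Files.exists(file)) {
            return null;
        }
        try {
            value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        put(key, value); // bring the entry back into memory
        return value;
    }

    /** Number of entries currently held in memory. */
    public int inMemorySize() {
        return memory.size();
    }

    private void write(String key, String value) {
        try {
            Files.write(fileFor(key), value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hex-encode the key so that any string maps to a valid file name.
    private Path fileFor(String key) {
        StringBuilder name = new StringBuilder();
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            name.append(String.format("%02x", b));
        }
        return dir.resolve(name + ".val");
    }
}
```

A join operator built along these lines would call put() for each incoming
tuple key and get() when probing for a match, so only the recently used
part of the window has to fit in memory. The sketch leaves out what a real
store also needs, such as checkpointing, purging of expired windows, and
batching of writes.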

For the new operators, besides JIRA, here is the source code (AFAIK separate
documentation is yet to be added):

https://github.com/apache/incubator-apex-malhar/tree/master/contrib/src/main/java/com/datatorrent/contrib/enrich
https://github.com/apache/incubator-apex-malhar/tree/master/library/src/main/java/com/datatorrent/lib/transform
https://github.com/apache/incubator-apex-malhar/tree/master/library/src/main/java/com/datatorrent/lib/projection

Thomas

On Wed, May 25, 2016 at 3:07 PM, Himanshu Bari <himanshub...@gmail.com>
wrote:

> Good work!
>
> Where can we find more information on
> - Large operator state management (embedded key/value storage)
> - New operators for transform, projection, enrichment
>
> On Wed, May 25, 2016 at 8:50 AM, Thomas Weise <t...@apache.org> wrote:
>
> > Dear Community,
> >
> > The Apache Apex community is pleased to announce release 3.4.0 of the
> > Malhar library.
> >
> > This release follows release 3.4.0 of core, resolves 66 JIRAs and adds
> > a number of exciting new features and enhancements, including:
> >
> > - First cut of the high level Java stream API
> > - Large operator state management (embedded key/value storage)
> > - Connectors for Apache NiFi
> > - Connectors and checkpointing with Apache Geode
> > - New operators for transform, projection, enrichment
> > - Support for Avro and Parquet formats
> >
> > Changes: https://s.apache.org/Gc1d
> >
> > Apache Apex is an enterprise grade native YARN big data-in-motion
> > platform that unifies stream and batch processing. Apex was built for
> > scalability and low-latency processing, high availability and
> > operability.
> >
> > Apache Apex is Java based and strives to ease application development on
> > a platform that takes care of aspects such as stateful fault tolerance,
> > partitioning, processing guarantees, buffering and synchronization,
> > auto-scaling etc. Apex comes with Malhar, a rich library of pre-built
> > operators, including adapters that integrate with existing technologies
> > as sources and destinations, like message buses, databases, files or
> > social media feeds.
> >
> > The source release can be found at:
> >
> > http://www.apache.org/dyn/closer.lua/apex/apache-apex-malhar-3.4.0/
> >
> > or visit:
> >
> > http://apex.apache.org/downloads.html
> >
> > We welcome your help and feedback. For more information on the project
> > and how to get involved, visit our website at:
> >
> > http://apex.apache.org/
> >
> > Regards,
> > The Apache Apex community
> >
>
