Dear all,

I am happy to announce the newest release of Rumble, the engine running JSONiq 
on Spark. JSONiq is XQuery's little brother that natively supports JSON-like 
data.

The 1.5.1 release contains many bug fixes and stability improvements, as well as:

- A growing list of input formats: now JSON (structured or semi-structured), 
Parquet, text, CSV, SVM, and ROOT, with more on the way.

- Unified support for seamlessly reading from and writing to the local file 
system, HDFS, S3, etc. The CLI arguments --query-path and --output-path, as 
well as paths passed to input functions, support any file system as long as 
the environment has the classes needed for the desired schemes.

- Many new builtin functions (XPath & XQuery 3.0 functions) are supported, 
i.e., our coverage of the standard continues to increase.

- Many more functions that used to force a materialization are now pushed down 
and executed in parallel (tail(), head(), etc.).

- Navigation expressions are now faster if the data is highly structured (i.e., 
they automagically leverage Spark's dataframes, for example if the data was 
read from Parquet or CSV), but of course continue to work efficiently if the 
data is heterogeneous (semi-structured JSON). The user doesn't see the 
difference in JSONiq (data independence).

- More extensive testing on clusters such as Amazon EMR, reading from and 
writing to S3.

- Compatibility with the latest Spark versions (2.4.x).

- And more hidden gems under development, to be announced later.
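
To illustrate the data independence point above, here is a minimal sketch of 
the same navigation applied to two differently structured inputs. The paths, 
the bucket name and the field names (country, name) are made up for 
illustration, and the exact names and signatures of the input functions 
(json-file() and parquet-file() are assumed here) should be checked against 
the documentation:

```jsoniq
(: Hypothetical paths and field names, for illustration only. :)
(
  (: Heterogeneous, semi-structured JSON input :)
  for $person in json-file("hdfs:///data/people.json")
  where $person.country eq "CH"
  return $person.name
),
(
  (: The same navigation over highly structured Parquet input :)
  for $person in parquet-file("s3://my-bucket/people.parquet")
  where $person.country eq "CH"
  return $person.name
)
```

Both FLWOR expressions are written identically; only the input function and 
the path differ.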

The release is free and open source (it is an 8 MB jar that you can simply wget 
over to your laptop or cluster with Spark installed, ready to use).

http://rumbledb.org/

Many thanks to all our contributors, many of whom are students working on their 
projects or theses.

Enjoy!

Kind regards,
Ghislain


_______________________________________________
[email protected]
http://x-query.com/mailman/listinfo/talk
