Thanks Stefan,

I had a quick look and liked the sound of being able to work with larger-than-memory datasets by analysing streams from data sources. However, I wasn't so impressed by DuckDB's showing (out-of-memory) on this benchmark site: https://h2oai.github.io/db-benchmark/. That said, the benchmark appears to use CSV rather than Parquet/Arrow, which may make a difference.

I currently read/write Parquet using Polars and am pretty happy with the performance. What I'm missing is the ability to read/write Parquet from J. I see there is a separate post which may help in that regard!
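For reference, a minimal Python sketch of both approaches, assuming the polars and duckdb packages are installed; the file name trades.parquet and its symbol/price columns are just placeholders:

import duckdb
import polars as pl

# Polars: eager read and write of a Parquet file.
df = pl.read_parquet("trades.parquet")
df.write_parquet("trades_copy.parquet")

# Polars: lazy scan, so only the requested columns are materialised;
# this is what makes larger-than-memory files workable.
lazy = pl.scan_parquet("trades.parquet").select(["symbol", "price"])
print(lazy.collect())

# DuckDB: query the Parquet file in place with SQL (no full load needed).
con = duckdb.connect()  # in-memory database
rows = con.execute(
    "SELECT symbol, avg(price) FROM 'trades.parquet' GROUP BY symbol"
).fetchall()
print(rows)

# DuckDB: write a query result straight back out as Parquet.
con.execute(
    "COPY (SELECT * FROM 'trades.parquet' WHERE price > 0) "
    "TO 'filtered.parquet' (FORMAT PARQUET)"
)

The lazy/streaming variants are the relevant ones for the larger-than-memory case mentioned above.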
On Wed, Feb 2, 2022 at 8:27 PM Stefan Baumann <ste...@bstr.at> wrote:
> Ric,
> You might want to check out DuckDB (https://duckdb.org/), I recently
> used it for reading and writing Parquet files.
> It's similar to SQLite but intended to be used for analytics.
> Stefan.
>
> On Wed, Feb 2, 2022 at 5:29 AM Ric Sherlock <tikk...@gmail.com> wrote:
>
> > I spend a fair bit of time wrangling data formatted as C structs, CSV and
> > am trying to move more to Parquet as a file storage format.
> > I've also had on my list to investigate what would be involved in
> > reading/writing Parquet from J. Do you know if anyone out there has looked
> > into this?
> > Ric

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------