adsharma commented on issue #679: URL: https://github.com/apache/incubator-graphar/issues/679#issuecomment-3320914564
Thank you for explaining! My solution is motivated by trying to serve Wikidata (90 million nodes, 800+ million edges) from Kuzu. The on-disk storage requirements were unacceptable due to denormalization, and I'm looking to serve graphs 10x that size, so on-disk storage with selective loading is the main use case. I also want to compete with LLMs on graph compression and storage efficiency by offloading some of the knowledge stored in them into external storage. Parquet files as they stand aren't sufficient, but they are a step in the right direction.

I don't want to mandate whether the edges are sorted by type or by graph structure; that depends on the use case, and I want to support both well. The Kuzu folks decided to support strongly typed nodes and edges, but you can always store a weakly typed graph by merging everything into a single "node" table and a single "rel" table (a schema sketch follows below).

If the Parquet files are sorted, readers can do predicate pushdown; DuckDB and Spark both support it. DuckDB's native storage is also supported as an additional single-file option. Why? It has a few more [compression tricks](https://duckdb.org/2025/09/08/duckdb-on-the-framework-laptop-13.html#tpc-h-sf10000), and a single file is more convenient: in that TPC-H SF10000 example, the Parquet files were 4 TB while the DuckDB file was 2.7 TB. Kuzu has an extension to read from DuckDB, but I'm not sure it can handle TB-sized files or do efficient predicate pushdown.
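For concreteness, here is a minimal sketch of the weakly typed layout using Kuzu's Python API. The database path, table names, and properties are hypothetical, just to illustrate collapsing all labels into one "node" table and one "rel" table:

```python
import kuzu

# Hypothetical database path, for illustration only.
db = kuzu.Database("wikidata_kuzu")
conn = kuzu.Connection(db)

# One generic node table and one generic rel table instead of a table per
# strongly typed label; per-label properties would be folded into generic
# columns (hypothetical schema).
conn.execute(
    "CREATE NODE TABLE node(id INT64, label STRING, PRIMARY KEY (id))"
)
conn.execute(
    "CREATE REL TABLE rel(FROM node TO node, type STRING)"
)
```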
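And a sketch of the predicate-pushdown and single-file points with DuckDB's Python API, assuming edge files sorted by source id (the file layout and column names are hypothetical):

```python
import duckdb

con = duckdb.connect()

# If the edge files are sorted by source id, DuckDB's row-group zone maps
# (min/max statistics) let it skip most of the data when a predicate on
# `src` is pushed into the Parquet scan.
rows = con.execute(
    "SELECT dst FROM read_parquet('edges/*.parquet') WHERE src = 42"
).fetchall()

# EXPLAIN shows the filter applied inside the Parquet scan operator rather
# than in a separate filter above it.
print(con.execute(
    "EXPLAIN SELECT dst FROM read_parquet('edges/*.parquet') WHERE src = 42"
).fetchall())

# The same data rewritten as a single-file native DuckDB database, which can
# apply additional lightweight compression schemes beyond what Parquet offers.
file_con = duckdb.connect("edges.duckdb")
file_con.execute(
    "CREATE TABLE edges AS SELECT * FROM read_parquet('edges/*.parquet')"
)
```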