SemyonSinchenko commented on issue #756: URL: https://github.com/apache/incubator-graphar/issues/756#issuecomment-3297328060
> I have some questions:
> Q1: Is the I/O granularity at the chunk level?
> For example, in `java-io-parquet`, is each I/O operation reading one entire chunk, and then filtering and computation within the chunk are done in memory?

In that case we lose all the benefits of GAR (except better compression). Because GAR data is sorted by ID and we also have an index table, we can benefit from that by doing pushdowns. For example, if a user wants to read only the 2-hop neighborhood of a small subset of IDs in the graph, we should first check the index table and the per-column min-max statistics stored in the Parquet file metadata to skip most of the chunks. Reading whole chunks and filtering in memory sounds crazy to me. Why would we do that?
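
To make the chunk-skipping idea in the comment concrete, here is a minimal sketch in plain Python. It is not GraphAr or `java-io-parquet` code. `ChunkStats` and `chunks_to_read` are hypothetical names, and the per-chunk `min_id`/`max_id` values stand in for the min-max statistics a real reader would take from Parquet metadata. It only shows how a small set of requested IDs can rule out most chunks before any data is read.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkStats:
    """Hypothetical per-chunk summary: file path plus min/max ID of the sorted ID column."""
    path: str
    min_id: int
    max_id: int


def chunks_to_read(chunks, wanted_ids):
    """Return only the chunks whose [min_id, max_id] range contains a wanted ID."""
    ids = sorted(set(wanted_ids))
    selected = []
    for chunk in chunks:
        # First wanted ID that is >= the chunk's minimum ID.
        pos = bisect_left(ids, chunk.min_id)
        # The chunk can hold a match only if that ID is also <= its maximum.
        if pos < len(ids) and ids[pos] <= chunk.max_id:
            selected.append(chunk)
    return selected


# Assumed layout for illustration: 100 chunks of 1000 consecutive IDs each.
chunks = [ChunkStats(f"chunk{i}", i * 1000, i * 1000 + 999) for i in range(100)]
selected = chunks_to_read(chunks, [5, 17, 42_310])
print([c.path for c in selected])  # ['chunk0', 'chunk42']
```

With these assumed numbers, only 2 of 100 chunks would need any I/O. The remaining filtering inside a selected chunk could then use finer-grained statistics where the file format provides them.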