davisusanibar commented on code in PR #258: URL: https://github.com/apache/arrow-cookbook/pull/258#discussion_r995106822
##########
java/source/dataset.rst:
##########
@@ -213,6 +231,87 @@ Query Data Content For File
     2    Gladis
     3    Juan

+Let's try to read a gzip-compressed Parquet file with 3 row groups:
+
+.. code-block::
+
+   $ parquet-tools meta data4_3rg_gzip.parquet
+
+   file schema:   schema
+   age:           OPTIONAL INT64 R:0 D:1
+   name:          OPTIONAL BINARY L:STRING R:0 D:1
+   row group 1:   RC:4 TS:182 OFFSET:4
+   row group 2:   RC:4 TS:190 OFFSET:420
+   row group 3:   RC:3 TS:179 OFFSET:838
+
+In this case we configure the ScanOptions ``batchSize`` argument to 20 rows. Because that is greater
+than the 4 rows per row group in the file, batches of 4 rows are produced instead of the 20 requested.

Review Comment:
   Deleted

##########
java/source/dataset.rst:
##########
@@ -25,6 +25,24 @@ Dataset

 .. contents::

+Arrow Java Dataset offers native functionality by consuming native artifacts such as:

Review Comment:
   Deleted

##########
java/source/dataset.rst:
##########
@@ -25,6 +25,24 @@ Dataset

 .. contents::

+Arrow Java Dataset offers native functionality by consuming native artifacts such as:
+
+- JNI Arrow C++ Dataset: ``libarrow_dataset_jni`` (dylib/so):
+  Creates the C++ objects (Schema, Dataset, Scanner) natively and exports them as references (long ids).
+- JNI Arrow C Data Interface: ``libarrow_cdata_jni`` (dylib/so):
+  Retrieves the C++ RecordBatch.

Review Comment:
   Deleted

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
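A side note on the batch-size behavior quoted in the first hunk above: the effective batch size is capped by the number of rows actually available in a row group. A minimal sketch of that semantics in plain Java, with no Arrow dependency (`effectiveBatchSize` is a hypothetical helper for illustration, not an Arrow API):

```java
public class BatchSizeDemo {
    // Hypothetical helper illustrating the rule described in the review:
    // a requested batch size larger than the rows in a row group is
    // capped at the row-group size.
    static long effectiveBatchSize(long requestedBatchSize, long rowsInRowGroup) {
        return Math.min(requestedBatchSize, rowsInRowGroup);
    }

    public static void main(String[] args) {
        // ScanOptions batchSize = 20, but each row group holds only 4 rows,
        // so each produced batch contains 4 rows.
        System.out.println(effectiveBatchSize(20, 4)); // prints 4
        // A request smaller than the row group is honored as-is.
        System.out.println(effectiveBatchSize(2, 4));  // prints 2
    }
}
```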