daniel-adam-tfs commented on code in PR #477:
URL: https://github.com/apache/arrow-go/pull/477#discussion_r2284898520


##########
parquet/file/file_reader_test.go:
##########
@@ -927,3 +928,61 @@ func TestListColumns(t *testing.T) {
                }
        }
 }
+
+func BenchmarkReadInt32Column(b *testing.B) {
+       b.Skip("rle-dict-int32-snappy.parquet not available")
+
+       dir := os.Getenv("PARQUET_TEST_DATA")
+       if dir == "" {
+               dir = "../../parquet-testing/data"
+               b.Log("PARQUET_TEST_DATA not set, using ../../parquet-testing/data")
+       }
+       require.DirExists(b, dir)
+
+       filePath := filepath.Join(dir, "rle-dict-int32-snappy.parquet")
+       reader, err := file.OpenParquetFile(filePath, false)
+       if err != nil {
+               b.Fatalf("Expected no error while opening parquet file %q, got %v", filePath, err)
+       }
+       defer reader.Close()
+
+       int32ColIdx := reader.MetaData().Schema.Root().FieldIndexByName("int32")
+       if int32ColIdx < 0 {
+               b.Fatalf("Expected to find int32 column in schema, got index %d", int32ColIdx)
+       }
+
+       numValues := reader.NumRows()
+       values := make([]int32, numValues)
+       b.StopTimer()

Review Comment:
   I definitely shouldn't have three calls to `StopTimer` in this benchmark; I
think I forgot to remove the one on line 961.
   
   The timer is already running when the benchmark function starts, and
`ResetTimer` doesn't stop it. I want to measure only the `ReadBatch` call, so I
stop the timer to keep the execution of lines 958-971 out of the benchmark
timing. Or do you think we should include them?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to