[ 
https://issues.apache.org/jira/browse/BEAM-14304?focusedWorklogId=762692&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762692
 ]

ASF GitHub Bot logged work on BEAM-14304:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Apr/22 04:05
            Start Date: 27/Apr/22 04:05
    Worklog Time Spent: 10m 
      Work Description: lostluck commented on code in PR #17347:
URL: https://github.com/apache/beam/pull/17347#discussion_r859358611


##########
sdks/go/pkg/beam/io/parquetio/parquetio.go:
##########
@@ -118,6 +122,19 @@ func (a *parquetReadFn) ProcessElement(ctx context.Context, filename string, emi
        return nil
 }
 
+// Write writes a PCollection<parquetStruct> to .parquet file.
+// Write expects a type t of struct with parquet tags
+// For example:
+// type Student struct {
+//   Name    string  `parquet:"name=name, type=BYTE_ARRAY, convertedtype=UTF8, encoding=PLAIN_DICTIONARY"`
+//   Age     int32   `parquet:"name=age, type=INT32, encoding=PLAIN"`
+//   Id      int64   `parquet:"name=id, type=INT64"`
+//   Weight  float32 `parquet:"name=weight, type=FLOAT"`
+//   Sex     bool    `parquet:"name=sex, type=BOOLEAN"`
+//   Day     int32   `parquet:"name=day, type=INT32, convertedtype=DATE"`
+//   Ignored int32   //without parquet tag and won't write
+// }
+

Review Comment:
   ```suggestion
   // }
   ```





Issue Time Tracking
-------------------

    Worklog Id:     (was: 762692)
    Time Spent: 1h 10m  (was: 1h)

> Implement parquetio for Go SDK
> ------------------------------
>
>                 Key: BEAM-14304
>                 URL: https://issues.apache.org/jira/browse/BEAM-14304
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-go
>            Reporter: Nguyen Khoi Nguyen
>            Priority: P2
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The naive approach would be to read the whole parquet file into memory,
> because processing parquet files requires an io.Seeker.
> Alternatively, the filesystem.go Interface could be made to return an
> io.ReadSeekCloser, but that would not be trivial for GCS.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
