[
https://issues.apache.org/jira/browse/ARROW-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104426#comment-17104426
]
Wes McKinney commented on ARROW-7905:
-------------------------------------
We can discuss more on the mailing list, but keep in mind that it didn't start
out that way. It only ended up this way as the shortest path to avoid certain
previous serialization steps (for example, converting {{arrow::BinaryArray}} to
{{ByteArray*}} en route to encoding) and to provide access to internal encoding
details (like the dictionary encoding). I would guess that with a from-scratch
rewrite there might be some way around this through some substantial template
structures that serve the same ends (exposing encoding internals via ~zero-cost
abstractions).
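
To make the first point concrete, here is a minimal Go sketch of encoding
variable-length binary values straight from offsets and data buffers, without
first materializing one per-value struct for each entry. It is an illustration
only and is not taken from the C++ code or from the port: {{BinaryColumn}} and
{{EncodePlain}} are hypothetical names, and the column is a simplified stand-in
for an Arrow binary array (no validity bitmap, offsets starting at 0). The
output layout follows the Parquet PLAIN encoding for BYTE_ARRAY, a 4-byte
little-endian length followed by the value bytes. It does not touch the
dictionary-encoding side of the discussion.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// BinaryColumn is a hypothetical, simplified stand-in for an Arrow binary
// array: value i occupies Data[Offsets[i]:Offsets[i+1]].
type BinaryColumn struct {
	Offsets []int32
	Data    []byte
}

// EncodePlain writes each value as a 4-byte little-endian length followed by
// its bytes, reading directly from the column buffers. No intermediate slice
// of per-value structs is built along the way.
func EncodePlain(col BinaryColumn) []byte {
	n := len(col.Offsets) - 1
	if n <= 0 {
		return nil
	}
	// Capacity estimate assumes the offsets start at 0 and cover all of Data.
	out := make([]byte, 0, 4*n+len(col.Data))
	var lenBuf [4]byte
	for i := 0; i < n; i++ {
		start, end := col.Offsets[i], col.Offsets[i+1]
		binary.LittleEndian.PutUint32(lenBuf[:], uint32(end-start))
		out = append(out, lenBuf[:]...)
		out = append(out, col.Data[start:end]...)
	}
	return out
}

func main() {
	// Three values: "hi", "" (empty), and "abc".
	col := BinaryColumn{Offsets: []int32{0, 2, 2, 5}, Data: []byte("hiabc")}
	fmt.Println(EncodePlain(col))
}
```

A design along these lines ties the encoder to the in-memory layout of its
input, which is the same trade-off described above.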
> [Go][Parquet] Port the C++ Parquet implementation to Go
> -------------------------------------------------------
>
> Key: ARROW-7905
> URL: https://issues.apache.org/jira/browse/ARROW-7905
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Go
> Reporter: Nick Poorman
> Assignee: Nick Poorman
> Priority: Minor
> Labels: Go, Parquet, golang
> Time Spent: 88.2h
> Remaining Estimate: 36h 23m
>
> I’m currently in the process of porting the C++ version of Parquet in the
> Apache Arrow project to Golang. Many projects and companies have been and are
> building their data lakes and persistence layer using Parquet. Apache Spark
> uses it heavily for persistence (including Databricks Delta Lake).
> To me this is the missing component for people to truly begin using the Go
> implementation of Arrow with any existing data architectures.
> If you have any interest in this project, give this issue a watch as it will
> keep me motivated to finish the port. Also, if you have specific use cases
> feel free to drop them in here so I can keep them in mind as I continue with
> the port.
> Things with the code base are rather in flux at the moment as I figure out
> how to reconcile various differences between the features of C++ and Go. As soon as I
> have a solid chunk of the port working, I’ll create a PR in the Apache Arrow
> project on GitHub and let everyone know in here.