zeroshade commented on issue #252:
URL: https://github.com/apache/arrow-go/issues/252#issuecomment-2596257760

   It honestly depends. Both are released under the Apache License 2.0, so 
licensing isn't going to be an issue or differ between them.
   
   Ultimately it's going to be a question of Features, Performance and 
Maintenance.
   
   ### Features
   
   For example, if you need to support the Parquet encryption capabilities, 
then you should use this library (`apache/arrow-go/v18/parquet`), since 
`parquet-go/parquet-go` doesn't support encryption, to my knowledge.
   
   If you need to leverage bloom filters, then you'll need to use 
`parquet-go/parquet-go` instead of this library as we don't have support for 
bloom filters yet (though I am currently working on that!).
   
   If you are already leveraging Apache Arrow itself for anything (ADBC for 
database interaction, Flight/FlightSQL for wire protocol, interacting with 
DuckDB or other Arrow-compatible/native compute engines, etc.), then this 
library is going to be more performant and beneficial due to the direct 
integration it has with Arrow through the `pqarrow` package. If your data is 
laid out in Go structs, then `parquet-go/parquet-go` has a simpler API, as I 
haven't had the time to enable this library's writers to accept Go structs yet.
   
   I plan to improve the public APIs for the writers of this library to 
better utilize generics, whereas `parquet-go` already has APIs that utilize 
generics. And so on.
   
   ### Performance
   
   As for performance and memory usage, I haven't benchmarked anything 
significant between the two libraries, so I can't speak to any comparison 
there. I invite you to run comparisons with your own use case. That said, if 
you do find that `parquet-go` is more performant or has better memory usage 
than this library, please come back and let me know! I'd love to attempt to 
address any performance/memory usage issues that you come across, as great 
pains have been taken to optimize this library as much as possible (just as 
`parquet-go` has, with different solutions to some things).
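
If it helps as a starting point for that kind of comparison, here is a minimal sketch of a harness built only on the standard library's `testing.Benchmark`. Everything named here (`encodeFunc`, `compare`, `writeWithBuffer`, `writePreallocated`) is hypothetical, and the two candidates are plain byte encoders standing in for "write these rows with library X". They are not Parquet code; you would swap their bodies for real calls into each library with your own data.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"testing"
)

// encodeFunc is a hypothetical stand-in for "write these rows with library X".
// It returns the number of bytes produced.
type encodeFunc func(rows []int64) (int, error)

// writeWithBuffer is a placeholder candidate: it grows a bytes.Buffer row by row.
func writeWithBuffer(rows []int64) (int, error) {
	var buf bytes.Buffer
	for _, v := range rows {
		if err := binary.Write(&buf, binary.LittleEndian, v); err != nil {
			return 0, err
		}
	}
	return buf.Len(), nil
}

// writePreallocated is a placeholder candidate: it fills one preallocated slice.
func writePreallocated(rows []int64) (int, error) {
	buf := make([]byte, 8*len(rows))
	for i, v := range rows {
		binary.LittleEndian.PutUint64(buf[8*i:], uint64(v))
	}
	return len(buf), nil
}

// compare runs every candidate under testing.Benchmark with allocation
// reporting enabled and returns the results keyed by candidate name.
func compare(rows []int64, candidates map[string]encodeFunc) map[string]testing.BenchmarkResult {
	testing.Init() // registers testing flags when not run via "go test"; no-op otherwise
	results := make(map[string]testing.BenchmarkResult, len(candidates))
	for name, fn := range candidates {
		results[name] = testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				if _, err := fn(rows); err != nil {
					b.Fatal(err)
				}
			}
		})
	}
	return results
}

func main() {
	// Synthetic input; use data shaped like your real workload instead.
	rows := make([]int64, 10_000)
	for i := range rows {
		rows[i] = int64(i)
	}
	results := compare(rows, map[string]encodeFunc{
		"buffer":       writeWithBuffer,
		"preallocated": writePreallocated,
	})
	for name, r := range results {
		// String reports iterations and ns/op; MemString reports B/op and allocs/op.
		fmt.Printf("%-13s %s %s\n", name, r.String(), r.MemString())
	}
}
```

Numbers from a harness like this only mean something for the data and options you feed it, so row counts, column types, compression, and encoding settings should match your real use case on both sides.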
   
   ### Maintenance
   
   Judging by the frequency of commits, both projects are actively maintained. 
While I can't speak to the maintainers of `parquet-go`, I can say that this 
project is closely connected to the Parquet PMC and the Apache community as far 
as keeping up with any changes to the Parquet format, having input into the 
wider community, and so on. Technically, this library is also considered the 
*official* Go library for Parquet.
   
   
   I hope the above helps you make a decision, or at least gives you a 
direction for exploration. In the end, I'm interested in what you end up going 
with and why. In particular, if you do end up going with `parquet-go` over this 
library, I would like to know why you made that decision so I can address those 
issues and make this library better and more desirable :smile:
   
   Honestly, thanks for filing this issue!

