Sounds good! Thanks for filing that issue. If you'd like to work on separating out the Hadoop code, I'd be happy to help review. It's something I've been meaning to do for a while.
On Fri, Jan 24, 2020 at 10:13 AM David Mollitor <[email protected]> wrote:

> Thanks, Ryan, for confirming my suspicions.
>
> That would certainly make a quick sample application easier to achieve
> from an adoption perspective.
>
> I had just put this JIRA in. I'll leave it open for anyone to jump in on:
> https://issues.apache.org/jira/browse/PARQUET-1776
>
> Thanks,
> David
>
> On Fri, Jan 24, 2020 at 12:08 PM Ryan Blue <[email protected]> wrote:
>
>> There's not currently a way to do this without Hadoop. We've been working
>> on moving to the `InputFile` and `OutputFile` abstractions so that we can
>> get rid of it, but Parquet still depends on Hadoop libraries for
>> compression, and we haven't pulled out the parts of Parquet that use the
>> new abstraction from the older ones that accept Hadoop Paths, so you need
>> to have Hadoop in your classpath either way.
>>
>> To get to where you can write a file without Hadoop dependencies, I think
>> we need to create a new module, which parquet-hadoop will depend on, with
>> the `InputFile`/`OutputFile` implementation. Then we would refactor the
>> Hadoop classes to extend those implementations to avoid breaking the
>> Hadoop classes. We'd also need to implement the compression API directly
>> on top of aircompressor in this module.
>>
>> On Thu, Jan 23, 2020 at 4:40 PM David Mollitor <[email protected]> wrote:
>>
>>> I am usually a user of Parquet through Hive or Spark, but I wanted to
>>> sit down and write my own small example application using the library
>>> directly.
>>>
>>> Is there some quick way that I can write a Parquet file to the local
>>> file system using java.nio.Path (i.e., with no Hadoop dependencies)?
>>>
>>> Thanks!
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix

--
Ryan Blue
Software Engineer
Netflix
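[Editor's note: the `InputFile` adapter Ryan describes can be sketched as below. This is a hedged illustration, not the Parquet implementation: Parquet's real `org.apache.parquet.io.InputFile` interface and its `SeekableInputStream` return type live in parquet-common, but here the two interfaces (`InputFile` and `SeekableSource`) are redeclared as simplified local stand-ins so the example runs without any Parquet or Hadoop jars on the classpath. The `NioInputFile` class name is also hypothetical. In real code you would implement the Parquet interfaces directly.]

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NioInputFileSketch {

    // Simplified stand-in for org.apache.parquet.io.InputFile.
    interface InputFile {
        long getLength() throws IOException;
        SeekableSource newStream() throws IOException;
    }

    // Simplified stand-in for Parquet's seekable-stream abstraction.
    interface SeekableSource extends AutoCloseable {
        void seek(long pos) throws IOException;
        int read(ByteBuffer dst) throws IOException;
        void close() throws IOException;
    }

    /** An InputFile backed purely by java.nio — no Hadoop Path or FileSystem. */
    static class NioInputFile implements InputFile {
        private final Path path;

        NioInputFile(Path path) {
            this.path = path;
        }

        @Override
        public long getLength() throws IOException {
            return Files.size(path);
        }

        @Override
        public SeekableSource newStream() throws IOException {
            SeekableByteChannel ch = Files.newByteChannel(path, StandardOpenOption.READ);
            return new SeekableSource() {
                @Override public void seek(long pos) throws IOException { ch.position(pos); }
                @Override public int read(ByteBuffer dst) throws IOException { return ch.read(dst); }
                @Override public void close() throws IOException { ch.close(); }
            };
        }
    }

    public static void main(String[] args) throws Exception {
        // Write a small local file, then read it back through the adapter.
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, "hello parquet".getBytes(StandardCharsets.UTF_8));

        InputFile file = new NioInputFile(tmp);
        System.out.println("length=" + file.getLength());

        ByteBuffer buf = ByteBuffer.allocate(5);
        try (SeekableSource in = file.newStream()) {
            in.seek(6); // skip "hello "
            in.read(buf);
        }
        System.out.println("read=" + new String(buf.array(), StandardCharsets.UTF_8));
        Files.deleteIfExists(tmp);
    }
}
```

The point of the abstraction is exactly what the thread describes: random-access reads (`seek` plus buffered `read`) and a known length are all the file format needs, and `java.nio` supplies both without dragging in Hadoop's `Path`/`FileSystem` classes.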
