@xhochy I stand awaiting instructions about how to proceed with the merge. My preference is to squash the history and, while we are at it, move the data files to http://github.com/apache/arrow-testing.
According to https://stackoverflow.com/a/15802324, through some trickery we can graft the src/parquet commit history onto the Arrow repo to be able to do an `--ff-merge`. If we are collectively OK with the idea of pulling ~400 commits into the Arrow repo, then we could do this.
If we pull in the commit history with `filter-branch`, one question is what to do with the early "unclean" commits (no JIRAs). I can't claim to speak for @nongli, but I suspect he wouldn't mind if we squashed these pre-Apache commits as the seed of the project.
```
* 592cf71 2015-04-02 | PARQUET-232: minor compilation issue [Fabrizio Fabbri]
* db20bae 2014-09-08 | Add "parquet_reader.cc" in the folder "example". [Yue Chen]
* bce89a0 2014-10-28 | PARQUET-120: Copy dev scripts and readme from parquet-mr. [Nong Li]
* f23be56 2014-06-04 | Fix broken download_thirdparty script. [Nong Li]
* 153f92b 2014-06-03 | Add apache license. [Nong Li]
* 09d3c74 2014-06-02 | Switch to int64. [Nong Li]
* bfe23da 2014-06-02 | Add lz4 codec. [Nong Li]
* aa09914 2014-06-02 | Added a quick snappy & plain benchmark. [Nong Li]
* 143a868 2014-05-31 | Add thrift generated sources. [Nong Li]
* df3c37a 2014-05-31 | Implement snappy decompression. [Nong Li]
* dedad16 2014-05-31 | Abstraction for compression. [Nong Li]
* 0da0397 2014-05-31 | Move encoding to own folder. [Nong Li]
* 9a9125f 2014-05-31 | Add a plain encoded test file and some misc documentation. [Nong Li]
* e8be46d 2014-05-31 | Updated readme. [Nong Li]
* ef57939 2014-05-31 | Fix off by one in delta bit pack encoding. [Nong Li]
* 48591fd 2014-05-30 | fix core-dump while reading PLAIN_ENCODING; fix readNewPage to return page instead of the last one [Jaguar Xiong]
* 846e6b4 2014-05-27 | Fix max encoded size estimate in RLE encoding. [Nong Li]
* 46f5b52 2014-05-23 | Update readme. [Nong Li]
* 104373f 2014-05-23 | Move encodings to separate files. [Nong Li]
* 078bb78 2014-05-23 | Implement delta byte length and delta string encodings. [Nong Li]
* e214cb4 2014-05-22 | Fix delta binary encoding. Decodes at about ~200M/sec. [Nong Li]
* c656554 2014-05-22 | Cleanup bitpacking encoding. Add delta length encoding. [Nong Li]
* 992f187 2014-05-20 | Initial implementation of delta binary packing. [Nong Li]
* 722c751 2014-05-19 | Added simple decode benchmark. [Nong Li]
* 0959d2a 2014-05-19 | Inline some simple functions. [Nong Li]
* c23d93a 2014-05-19 | Change decoders to a batched API. [Nong Li]
* bc84379 2014-05-18 | Remove makefile. [Nong Li]
* 0ac13db 2014-05-18 | Update readme. [Nong Li]
* c19c5a7 2014-05-18 | Plumb through all the types. [Nong Li]
* 060c0d7 2014-05-18 | Better decoder management. [Nong Li]
* 9f1d702 2014-05-17 | Fix stream abstraction. [Nong Li]
* 674a392 2014-05-13 | Read a dictionary encoded int column. [Nong Li]
* f979b15 2014-05-13 | Implement some encodings. [Nong Li]
* f1b987e 2014-05-13 | Include some Impala utility code. [Nong Li]
* 9b42064 2014-05-13 | Can read the dictionary for ints. [Nong Li]
* eb10a21 2014-05-12 | Move build to cmake. [Nong Li]
* 01c30aa 2014-05-12 | Initial classes. [Nong Li]
* 08acdf6 2014-05-12 | Initial commit [Nong Li]
```
[ Full content available at: https://github.com/apache/arrow/pull/2453 ]
This message was relayed via gitbox.apache.org for [email protected]
