@xhochy I stand awaiting instructions about how to proceed with the merge. My 
preference is to squash the history and, while we are at it, move the data 
files to http://github.com/apache/arrow-testing.

According to https://stackoverflow.com/a/15802324, with some trickery we can 
graft the src/parquet commit history onto the Arrow repo so that a fast-forward 
merge (e.g. `git merge --ff-only`) becomes possible. If we are collectively OK 
with pulling ~400 commits into the Arrow repo, then we could do this.

If we pull in the commit history with `git filter-branch`, one question is what 
to do with the early "unclean" commits (those without JIRA IDs).

I can't claim to speak for @nongli but I suspect he wouldn't mind if we 
squashed these pre-Apache commits as the seed of the project. 

```
* 592cf71 2015-04-02 | PARQUET-232: minor compilation issue [Fabrizio Fabbri]
* db20bae 2014-09-08 | Add "parquet_reader.cc" in the folder "example". [Yue Chen]
* bce89a0 2014-10-28 | PARQUET-120: Copy dev scripts and readme from parquet-mr. [Nong Li]
* f23be56 2014-06-04 | Fix broken download_thirdparty script. [Nong Li]
* 153f92b 2014-06-03 | Add apache license. [Nong Li]
* 09d3c74 2014-06-02 | Switch to int64. [Nong Li]
* bfe23da 2014-06-02 | Add lz4 codec. [Nong Li]
* aa09914 2014-06-02 | Added a quick snappy & plain benchmark. [Nong Li]
* 143a868 2014-05-31 | Add thrift generated sources. [Nong Li]
* df3c37a 2014-05-31 | Implement snappy decompression. [Nong Li]
* dedad16 2014-05-31 | Abstraction for compression. [Nong Li]
* 0da0397 2014-05-31 | Move encoding to own folder. [Nong Li]
* 9a9125f 2014-05-31 | Add a plain encoded test file and some misc documentation. [Nong Li]
* e8be46d 2014-05-31 | Updated readme. [Nong Li]
* ef57939 2014-05-31 | Fix off by one in delta bit pack encoding. [Nong Li]
* 48591fd 2014-05-30 | fix core-dump while reading PLAIN_ENCODING; fix readNewPage to return page instead of the last one [Jaguar Xiong]
* 846e6b4 2014-05-27 | Fix max encoded size estimate in RLE encoding. [Nong Li]
* 46f5b52 2014-05-23 | Update readme. [Nong Li]
* 104373f 2014-05-23 | Move encodings to separate files. [Nong Li]
* 078bb78 2014-05-23 | Implement delta byte length and delta string encodings. [Nong Li]
* e214cb4 2014-05-22 | Fix delta binary encoding. Decodes at about ~200M/sec. [Nong Li]
* c656554 2014-05-22 | Cleanup bitpacking encoding. Add delta length encoding. [Nong Li]
* 992f187 2014-05-20 | Initial implementation of delta binary packing. [Nong Li]
* 722c751 2014-05-19 | Added simple decode benchmark. [Nong Li]
* 0959d2a 2014-05-19 | Inline some simple functions. [Nong Li]
* c23d93a 2014-05-19 | Change decoders to a batched API. [Nong Li]
* bc84379 2014-05-18 | Remove makefile. [Nong Li]
* 0ac13db 2014-05-18 | Update readme. [Nong Li]
* c19c5a7 2014-05-18 | Plumb through all the types. [Nong Li]
* 060c0d7 2014-05-18 | Better decoder management. [Nong Li]
* 9f1d702 2014-05-17 | Fix stream abstraction. [Nong Li]
* 674a392 2014-05-13 | Read a dictionary encoded int column. [Nong Li]
* f979b15 2014-05-13 | Implement some encodings. [Nong Li]
* f1b987e 2014-05-13 | Include some Impala utility code. [Nong Li]
* 9b42064 2014-05-13 | Can read the dictionary for ints. [Nong Li]
* eb10a21 2014-05-12 | Move build to cmake. [Nong Li]
* 01c30aa 2014-05-12 | Initial classes. [Nong Li]
* 08acdf6 2014-05-12 | Initial commit [Nong Li]
```
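
To decide where the "unclean" range starts and ends, the log above could be scanned mechanically. The following is only a rough sketch, not part of any existing tooling: the names `LOG_LINE`, `JIRA_PREFIX`, and `split_by_jira` are made up for illustration, and it assumes one commit per line in the `* <hash> <date> | <subject> [<author>]` format shown above, with a "clean" subject starting with a JIRA key such as `PARQUET-232:`.

```python
import re

# Assumed line format, taken from the log excerpt:
#   * <hash> <date> | <subject> [<author>]
LOG_LINE = re.compile(
    r"^\* (?P<sha>[0-9a-f]{7,40}) (?P<date>\d{4}-\d{2}-\d{2}) "
    r"\| (?P<subject>.*) \[(?P<author>[^\]]+)\]$"
)

# Assumption: a "clean" commit subject begins with a JIRA key, e.g. "PARQUET-232:".
JIRA_PREFIX = re.compile(r"^[A-Z]+-\d+:")


def split_by_jira(log_text):
    """Return (clean, unclean) lists of abbreviated hashes, in log order."""
    clean, unclean = [], []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            # Skip anything that is not a one-line commit entry.
            continue
        bucket = clean if JIRA_PREFIX.match(match["subject"]) else unclean
        bucket.append(match["sha"])
    return clean, unclean


# A few entries copied from the log above.
sample = """\
* 592cf71 2015-04-02 | PARQUET-232: minor compilation issue [Fabrizio Fabbri]
* bce89a0 2014-10-28 | PARQUET-120: Copy dev scripts and readme from parquet-mr. [Nong Li]
* f23be56 2014-06-04 | Fix broken download_thirdparty script. [Nong Li]
* 08acdf6 2014-05-12 | Initial commit [Nong Li]
"""

clean, unclean = split_by_jira(sample)
print("with JIRA:", clean)
print("without JIRA:", unclean)
```

The hashes in the `unclean` list would be the candidates for squashing into a single seed commit; whether to actually squash them is still the open question.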

[ Full content available at: https://github.com/apache/arrow/pull/2453 ]