Re: Travis CI has been failing

2016-08-29 Thread Liwei Lin
It's working now -- thanks Wes, Julien for the prompt fix!

On Tue, Aug 30, 2016 at 2:02 AM, Julien Le Dem  wrote:

> https://github.com/apache/parquet-mr/pull/364
>
> On Mon, Aug 29, 2016 at 4:41 AM, Wes McKinney  wrote:
>
> > Since googlecode project hosting seems to have completely shut down
> > (they had claimed that these downloads would be available the "rest of
> > 2016"), you can use the download links from GitHub:
> >
> > https://github.com/google/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.bz2
> > cf https://github.com/google/protobuf/releases/tag/v2.5.0
> >
> > - Wes
> >
> > On Mon, Aug 29, 2016 at 6:32 AM, Liwei Lin  wrote:
> > > Hi,
> > >
> > > Travis CI has been failing for the past two or three days; the log shows:
> > >
> > > The command "wget http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz"
> > > failed and exited with 8 during .
> > >
> > > (please refer to https://travis-ci.org/apache/parquet-mr/jobs/155839958)
> > >
> > > It'd be great if we could get Travis up, thanks!
> >
>
>
>
> --
> Julien
>


[jira] [Resolved] (PARQUET-696) Move travis download from google code (defunct) to github

2016-08-29 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PARQUET-696.
---
   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request 364
[https://github.com/apache/parquet-mr/pull/364]

> Move travis download from google code (defunct) to github
> -
>
> Key: PARQUET-696
> URL: https://issues.apache.org/jira/browse/PARQUET-696
> Project: Parquet
> Issue Type: Task
> Components: parquet-mr
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
> Fix For: 1.9.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PARQUET-696) Move travis download from google code (defunct) to github

2016-08-29 Thread Julien Le Dem (JIRA)
Julien Le Dem created PARQUET-696:
-

 Summary: Move travis download from google code (defunct) to github
 Key: PARQUET-696
 URL: https://issues.apache.org/jira/browse/PARQUET-696
 Project: Parquet
  Issue Type: Task
  Components: parquet-mr
Reporter: Julien Le Dem
Assignee: Julien Le Dem








Re: Travis CI has been failing

2016-08-29 Thread Julien Le Dem
https://github.com/apache/parquet-mr/pull/364

On Mon, Aug 29, 2016 at 4:41 AM, Wes McKinney  wrote:

> Since googlecode project hosting seems to have completely shut down
> (they had claimed that these downloads would be available the "rest of
> 2016"), you can use the download links from GitHub:
>
> https://github.com/google/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.bz2
> cf https://github.com/google/protobuf/releases/tag/v2.5.0
>
> - Wes
>
> On Mon, Aug 29, 2016 at 6:32 AM, Liwei Lin  wrote:
> > Hi,
> >
> > Travis CI has been failing for the past two or three days; the log shows:
> >
> > The command "wget http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz"
> > failed and exited with 8 during .
> >
> > (please refer to https://travis-ci.org/apache/parquet-mr/jobs/155839958)
> >
> > It'd be great if we could get Travis up, thanks!
>



-- 
Julien


[jira] [Commented] (PARQUET-684) [C++] Hardware optimizations for dictionary / RLE encoding/decoding

2016-08-29 Thread Daniel Lemire (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445971#comment-15445971
 ] 

Daniel Lemire commented on PARQUET-684:
---

Relevant blog post: 

https://github.com/lemire/dictionary

> [C++] Hardware optimizations for dictionary / RLE encoding/decoding
> ---
>
> Key: PARQUET-684
> URL: https://issues.apache.org/jira/browse/PARQUET-684
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Wes McKinney
>
> See discussion in 
> https://github.com/apache/parquet-cpp/pull/140
> and experiments from Daniel Lemire in 
> https://github.com/lemire/dictionary





[jira] [Updated] (PARQUET-695) C++: Better default encoding user experience

2016-08-29 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated PARQUET-695:

Component/s: parquet-cpp

> C++: Better default encoding user experience
> 
>
> Key: PARQUET-695
> URL: https://issues.apache.org/jira/browse/PARQUET-695
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Uwe L. Korn
>
> Currently the default encoding is PLAIN. Making dictionary encoding the
> default is probably the best choice, with the user able to select an
> alternative encoding for when the dictionary grows too large.
> The interface should be as follows:
>  * The user selects, globally and per column, whether we should attempt
> to dictionary-encode a column. Whether RLE_DICTIONARY or
> PLAIN_DICTIONARY is used in the metadata is hidden from the user.
>  * The user specifies a fallback (!= dictionary) encoding that is used if
> either dictionary encoding is not desired for a column or the dictionary
> has exceeded its size limit.
> As a recap, the current implementation selects the encoding solely based on
> the encoding variable. There is no fallback support if the dictionary grows
> too large. The only magic at the moment is that the user can supply either
> PLAIN_DICTIONARY or RLE_DICTIONARY, and the enum used in the metadata
> is the one suitable for the chosen Parquet version, not the one
> supplied by the user.





[jira] [Created] (PARQUET-695) C++: Better default encoding user experience

2016-08-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-695:
---

 Summary: C++: Better default encoding user experience
 Key: PARQUET-695
 URL: https://issues.apache.org/jira/browse/PARQUET-695
 Project: Parquet
  Issue Type: Improvement
Reporter: Uwe L. Korn


Currently the default encoding is PLAIN. Making dictionary encoding the
default is probably the best choice, with the user able to select an
alternative encoding for when the dictionary grows too large.

The interface should be as follows:

 * The user selects, globally and per column, whether we should attempt
to dictionary-encode a column. Whether RLE_DICTIONARY or
PLAIN_DICTIONARY is used in the metadata is hidden from the user.
 * The user specifies a fallback (!= dictionary) encoding that is used if
either dictionary encoding is not desired for a column or the dictionary
has exceeded its size limit.

As a recap, the current implementation selects the encoding solely based on
the encoding variable. There is no fallback support if the dictionary grows
too large. The only magic at the moment is that the user can supply either
PLAIN_DICTIONARY or RLE_DICTIONARY, and the enum used in the metadata is
the one suitable for the chosen Parquet version, not the one supplied by
the user.
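
For illustration only, the proposed interface might be sketched roughly as
follows. Every name here (EncodingOptions, FormatVersion, ChooseEncoding,
the size-limit parameters) is hypothetical and not taken from parquet-cpp.
The sketch also assumes that PLAIN_DICTIONARY is the enum written for
format version 1 and RLE_DICTIONARY for version 2, and that PLAIN is the
initial fallback; the issue itself does not state either.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical types for illustration; not the parquet-cpp API.
enum class Encoding { PLAIN, PLAIN_DICTIONARY, RLE_DICTIONARY, DELTA_BINARY_PACKED };
enum class FormatVersion { V1, V2 };

inline bool IsDictionary(Encoding e) {
  return e == Encoding::PLAIN_DICTIONARY || e == Encoding::RLE_DICTIONARY;
}

class EncodingOptions {
 public:
  // Global and per-column switch: should dictionary encoding be attempted?
  void set_dictionary(bool enabled) { default_dictionary_ = enabled; }
  void set_dictionary(const std::string& column, bool enabled) {
    dictionary_[column] = enabled;
  }

  // Global and per-column fallback; must not be a dictionary encoding.
  void set_fallback(Encoding e) { default_fallback_ = Checked(e); }
  void set_fallback(const std::string& column, Encoding e) {
    fallback_[column] = Checked(e);
  }

  bool dictionary_enabled(const std::string& column) const {
    auto it = dictionary_.find(column);
    return it != dictionary_.end() ? it->second : default_dictionary_;
  }
  Encoding fallback(const std::string& column) const {
    auto it = fallback_.find(column);
    return it != fallback_.end() ? it->second : default_fallback_;
  }

 private:
  static Encoding Checked(Encoding e) {
    if (IsDictionary(e)) {
      throw std::invalid_argument("fallback must not be a dictionary encoding");
    }
    return e;
  }

  bool default_dictionary_ = true;               // dictionary by default, as proposed
  Encoding default_fallback_ = Encoding::PLAIN;  // assumption: PLAIN as initial fallback
  std::unordered_map<std::string, bool> dictionary_;
  std::unordered_map<std::string, Encoding> fallback_;
};

// The dictionary enum written to the metadata is derived from the format
// version and never chosen by the user (the version mapping is an assumption).
inline Encoding DictionaryEncodingFor(FormatVersion version) {
  return version == FormatVersion::V1 ? Encoding::PLAIN_DICTIONARY
                                      : Encoding::RLE_DICTIONARY;
}

// Use dictionary encoding while it is enabled and within its size limit;
// otherwise use the configured fallback.
inline Encoding ChooseEncoding(const EncodingOptions& opts, const std::string& column,
                               FormatVersion version, std::size_t dictionary_bytes,
                               std::size_t dictionary_limit) {
  if (opts.dictionary_enabled(column) && dictionary_bytes <= dictionary_limit) {
    return DictionaryEncodingFor(version);
  }
  Encoding result = opts.fallback(column);
  assert(!IsDictionary(result));  // guaranteed by set_fallback
  return result;
}
```

Keeping the dictionary switch and the fallback in separate maps means a
per-column override of one does not freeze the global value of the other.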





Re: Travis CI has been failing

2016-08-29 Thread Wes McKinney
Since googlecode project hosting seems to have completely shut down
(they had claimed that these downloads would be available the "rest of
2016"), you can use the download links from GitHub:

https://github.com/google/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.bz2
cf https://github.com/google/protobuf/releases/tag/v2.5.0

- Wes

On Mon, Aug 29, 2016 at 6:32 AM, Liwei Lin  wrote:
> Hi,
>
> Travis CI has been failing for the past two or three days; the log shows:
>
> The command "wget http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz"
> failed and exited with 8 during .
>
> (please refer to https://travis-ci.org/apache/parquet-mr/jobs/155839958)
>
> It'd be great if we could get Travis up, thanks!


[jira] [Created] (PARQUET-693) C++: Determine a good default page size

2016-08-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created PARQUET-693:
---

 Summary: C++: Determine a good default page size
 Key: PARQUET-693
 URL: https://issues.apache.org/jira/browse/PARQUET-693
 Project: Parquet
  Issue Type: Improvement
  Components: parquet-cpp
Reporter: Uwe L. Korn


Currently the default data page size in parquet-cpp is 1MB, the same as in
parquet-mr. We should check with the other Parquet implementations whether
this is a good value, and run benchmarks.


