Re: [VOTE] Release Apache Parquet C++ 1.0.0 RC3

2017-02-27 Thread Wes McKinney
Thanks Ryan, With 2 +1 and 1 -1 binding votes, the RC does not pass. We will prepare patches to address the feedback and make another RC. On Mon, Feb 27, 2017 at 12:52 PM, Ryan Blue wrote: > -1 (binding) > > There are a few issues with license/notice documentation: >

Re: parquet sync starting now

2017-02-27 Thread Greg Rahn
I think the decision comes down to how many TIMESTAMP types Parquet (and the systems that use it as a format) wants to support, or which use cases are being targeted. If the answer is two, then it makes sense to follow the ANSI standard and what Postgres et al. have done: - timestamp [ without

Re: parquet sync starting now

2017-02-27 Thread Marcel Kornacker
Greg, thanks for this writeup. Going back to "timestamp with timezone" in Parquet: does anything speak *against* following the SQL standard and storing UTC without an attached timezone (and leaving it to the client to do the conversion correctly for timestamp literals)? On Mon, Feb 27, 2017 at
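
A minimal sketch of the approach raised in this message: normalize the value to UTC when it is written, keep no timezone with it, and let the reader convert to its own offset. The function names and the fixed offsets are illustrative assumptions; nothing here is Parquet's API or a decision reached in the thread.

```python
from datetime import datetime, timedelta, timezone


def to_stored_utc(value: datetime) -> datetime:
    """Normalize an offset-aware timestamp to UTC before storing it (hypothetical helper)."""
    if value.tzinfo is None:
        raise ValueError("expected an offset-aware timestamp")
    return value.astimezone(timezone.utc)


def to_session_time(stored_utc: datetime, session_offset: timedelta) -> datetime:
    """Convert a stored UTC instant to the reader's offset (hypothetical helper)."""
    return stored_utc.astimezone(timezone(session_offset))


# A writer at UTC-8 records 10:43 local time; only the UTC instant is kept.
written = datetime(2017, 2, 27, 10, 43, tzinfo=timezone(timedelta(hours=-8)))
stored = to_stored_utc(written)

# A reader at UTC+1 sees the same instant rendered in its own offset.
shown = to_session_time(stored, timedelta(hours=1))
```

The writer's original offset is not recoverable from `stored`, which is the trade-off debated in this thread.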

Re: [DISCUSS] C++ code sharing amongst Apache {Arrow, Kudu, Impala, Parquet}

2017-02-27 Thread Leif Walsh
I also support the idea of creating an "apache commons modern c++" style library, maybe tailored toward the needs of columnar data processing tools. I think APR is the wrong project but I think that *style* of project is the right direction to aim. I agree this adds test and release process

Re: [DISCUSS] C++ code sharing amongst Apache {Arrow, Kudu, Impala, Parquet}

2017-02-27 Thread Leif Walsh
Julian, are you proposing the arrow project ship two artifacts, arrow-common and arrow, where arrow depends on arrow-common? On Mon, Feb 27, 2017 at 11:51 Julian Hyde wrote: > “Commons” projects are often problematic. It is difficult to tell what is > in scope and out of scope.

Re: parquet sync starting now

2017-02-27 Thread Marcel Kornacker
On Mon, Feb 27, 2017 at 10:43 AM, Zoltan Ivanfi wrote: > What you describe (storing in UTC and adjusting to local time) is the > implicit timezone that is associated with the plain TIMEZONE type of ANSI > SQL. Excerpts: Postgres allows explicit timezone offsets in timestamp

Re: parquet sync starting now

2017-02-27 Thread Zoltan Ivanfi
What you describe (storing in UTC and adjusting to local time) is the implicit timezone that is associated with the plain TIMEZONE type of ANSI SQL. Excerpts: Datetime data types that contain time fields (TIME and TIMESTAMP) are maintained in Universal Coordinated Time (UTC), with an explicit

Re: parquet sync starting now

2017-02-27 Thread Marcel Kornacker
On Mon, Feb 27, 2017 at 8:47 AM, Zoltan Ivanfi wrote: > Hi, > > Although the draft of SQL-92[1] does not explicitly state that the time zone > offset has to be stored, the following excerpts strongly suggest that the > time zone has to be stored with each individual value of

Re: Day of Sync-up

2017-02-27 Thread Marcel Kornacker
It sounds like that only leaves Wednesday. Any objections to that? On Mon, Feb 27, 2017 at 12:43 AM, Zoltan Ivanfi wrote: > I'm busy on Mondays and Tuesdays, the rest of the week is fine by me. > > Zoltan > > On Mon, Feb 27, 2017 at 8:28 AM Uwe L. Korn

Re: Day of Sync-up

2017-02-27 Thread Marcel Kornacker
Tuesday then? On Sat, Feb 25, 2017 at 3:49 PM, Deepak Majeti wrote: > Other days of the week work for me too. > > On Sat, Feb 25, 2017 at 3:31 PM, Wes McKinney wrote: >> >> Moving this to the Parquet mailing list. Other days of the week work >> OK

Re: [VOTE] Release Apache Parquet C++ 1.0.0 RC3

2017-02-27 Thread Ryan Blue
-1 (binding) There are a few issues with license/notice documentation: cpplint.py has a copyright notice, “Copyright (c) 2009 Google Inc.” and a 3-clause BSD license that is missing from LICENSE.txt. NOTICE.txt contains license information that should only be in LICENSE.txt (the inclusion of PFOR

Re: [DISCUSS] C++ code sharing amongst Apache {Arrow, Kudu, Impala, Parquet}

2017-02-27 Thread Julian Hyde
“Commons” projects are often problematic. It is difficult to tell what is in scope and out of scope. If the scope is drawn too wide, there is a real problem of orphaned features, because people contribute one feature and then disappear. Let’s remember the Apache mantra: community over code. If

Re: parquet sync starting now

2017-02-27 Thread Zoltan Ivanfi
Hi, Although the draft of SQL-92[1] does not explicitly state that the time zone offset has to be stored, the following excerpts strongly suggest that the time zone has to be stored with each individual value of TIMESTAMP WITH TIME ZONE: The length of a TIMESTAMP is 19 positions [...] The

[jira] [Created] (PARQUET-899) Add metadata field describing the application that wrote the file

2017-02-27 Thread Zoltan Ivanfi (JIRA)
Zoltan Ivanfi created PARQUET-899: Summary: Add metadata field describing the application that wrote the file Key: PARQUET-899 URL: https://issues.apache.org/jira/browse/PARQUET-899 Project: Parquet

Re: [DISCUSS] C++ code sharing amongst Apache {Arrow, Kudu, Impala, Parquet}

2017-02-27 Thread Wes McKinney
Responding to Todd's e-mail: 1) Open source release model My expectation is that this library would release about once a month, with occasional faster releases for critical fixes. 2) Governance/review model Beyond having centralized code reviews, it's hard to predict how the governance would

Re: Day of Sync-up

2017-02-27 Thread Zoltan Ivanfi
I'm busy on Mondays and Tuesdays, the rest of the week is fine by me. Zoltan On Mon, Feb 27, 2017 at 8:28 AM Uwe L. Korn wrote: > All weekdays except Friday work for me. > > -- > Uwe L. Korn > uw...@xhochy.com > > On Sun, Feb 26, 2017, at 12:49 AM, Deepak Majeti wrote: >