[jira] [Created] (ARROW-5026) [Python][Packaging] conda package on non Windows is broken

2019-03-26 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5026:
---

 Summary: [Python][Packaging] conda package on non Windows is broken
 Key: ARROW-5026
 URL: https://issues.apache.org/jira/browse/ARROW-5026
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging, Python
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou


https://travis-ci.org/kou/crossbow/builds/511831955

{noformat}
-- Could not find the Gandiva library. Looked for headers in $PREFIX/include, 
and for libs in $PREFIX/lib
CMake Error at CMakeLists.txt:509 (message):
  Unable to locate Gandiva libraries
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-5025) [Packaging] wheel for Windows are broken

2019-03-26 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5025:
---

 Summary: [Packaging] wheel for Windows are broken
 Key: ARROW-5025
 URL: https://issues.apache.org/jira/browse/ARROW-5025
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
 Fix For: 0.13.0


[https://ci.appveyor.com/project/kou/crossbow/builds/23383931]


{noformat}
-- Found the Gandiva core library: 
C:/Miniconda35-x64/envs/arrow/Library/lib/gandiva.lib
CMake Error: File C:/Miniconda35-x64/envs/arrow/Library/lib/gandiva.dll does 
not exist.
CMake Error at CMakeLists.txt:218 (configure_file):
configure_file Problem configuring file
Call Stack (most recent call first):
CMakeLists.txt:518 (bundle_arrow_lib){noformat}





[jira] [Created] (ARROW-5024) [Release] crossbow.py --arrow-version causes missing variable error

2019-03-26 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5024:
---

 Summary: [Release] crossbow.py --arrow-version causes missing 
variable error
 Key: ARROW-5024
 URL: https://issues.apache.org/jira/browse/ARROW-5024
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Created] (ARROW-5023) [Release] Default value syntax in shell is wrong

2019-03-26 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5023:
---

 Summary: [Release] Default value syntax in shell is wrong
 Key: ARROW-5023
 URL: https://issues.apache.org/jira/browse/ARROW-5023
 Project: Apache Arrow
  Issue Type: Bug
  Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Created] (ARROW-5022) [C++] Implement more "Datum" types for AggregateKernel

2019-03-26 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-5022:
-

 Summary: [C++] Implement more "Datum" types for AggregateKernel
 Key: ARROW-5022
 URL: https://issues.apache.org/jira/browse/ARROW-5022
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Philipp Moritz


Currently it gives the following error if the datum isn't an array:
{code:java}
AggregateKernel expects Array datum{code}





Dask and Arrow Parquet Rewrite

2019-03-26 Thread Matthew Rocklin
Hi All,

A few months ago I started a rewrite of how Dask manages Parquet
reader/writers in an effort to simplify the system.  This work is here:
https://github.com/dask/dask/pull/4336

To summarize, Dask uses parquet reader libraries like pyarrow.parquet to
provide scalable reading of parquet datasets in parallel.  This requires
not only information about how to encode and decode bytes, but also logic
to select row groups, grab data from S3/GCS/..., apply filters, find sorted
index columns, and so on; these concerns are more commonly critical in a
distributed setting.  Previously the relationship between the two libraries
was somewhat messy, with this logic spread across both in a haphazard way.
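
As a rough illustration of the kind of logic that sits on the Dask side of
this boundary, here is a minimal pure-Python sketch of selecting row groups
by min/max statistics and checking whether a column could act as a sorted
index.  The `RowGroupStats` class, its field names, and both helper
functions are hypothetical and are not part of pyarrow or Dask; real
statistics would come from the Parquet file metadata.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group summary for a single column."""
    index: int       # position of the row group in the file
    min_value: int   # smallest value of the column in this row group
    max_value: int   # largest value of the column in this row group


def select_row_groups(stats: List[RowGroupStats], lo: int, hi: int) -> List[int]:
    """Return indices of row groups whose [min, max] range overlaps [lo, hi]."""
    return [s.index for s in stats if s.max_value >= lo and s.min_value <= hi]


def is_sorted_index(stats: List[RowGroupStats]) -> bool:
    """True if each row group starts at or after the end of the previous one,
    so the column could serve as a sorted index."""
    return all(a.max_value <= b.min_value for a, b in zip(stats, stats[1:]))


stats = [
    RowGroupStats(0, 0, 99),
    RowGroupStats(1, 100, 199),
    RowGroupStats(2, 200, 299),
]
# Only row groups that can contain values in [150, 220] need to be read.
selected = select_row_groups(stats, lo=150, hi=220)
```

Skipping row groups this way only works when the statistics are comparable
with the filter values, which is part of why the statistics issue listed
below matters.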

This PR tries to draw pretty strict lines between the two libraries and
establish a contract that hopefully we can stick to more easily in the
future.  For more information about that contract, I'd like to point people
to the github issue.

Things are looking pretty good so far, but a few features are missing from
Arrow that would be really nice to have in order to complete this rewrite.
In particular, two things have come up so far (though I'm sure that more
will arise):

   1. The ability to write a metadata file, given metadata collected from
   writing each row group.  https://issues.apache.org/jira/browse/ARROW-1983
   2. Getting statistics from types like unicode and datetime that may be
   stored differently from how users interpret them.
   https://issues.apache.org/jira/browse/ARROW-4139

My hope is that if we can resolve a few issues like this then we'll be able
to simplify the relationship between the projects on both sides, reduce
maintenance burden, and hopefully improve the overall experience as
well.

Best,
-matt

(this came up again in
https://github.com/apache/arrow/pull/3988#issuecomment-476696143)


[jira] [Created] (ARROW-5021) [C++] Review hardcoded "lib" paths in Find$PACKAGE.cmake related to endogenous libraries

2019-03-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-5021:
---

 Summary: [C++] Review hardcoded "lib" paths in Find$PACKAGE.cmake 
related to endogenous libraries
 Key: ARROW-5021
 URL: https://issues.apache.org/jira/browse/ARROW-5021
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Wes McKinney
 Fix For: 0.14.0


See https://github.com/apache/arrow/pull/4024#discussion_r269200888. We should 
try to use a more portable pattern for finding these libraries (in the event 
that libraries are installed in {{lib64}}, for example)
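
To make the idea concrete, here is a minimal sketch, written in Python
rather than CMake, of searching several candidate library directories under
a prefix instead of a hardcoded {{lib}}. The function name, the default
directory list, and the file names are assumptions for illustration only. In
CMake itself something like the {{PATH_SUFFIXES}} option of
{{find_library}} would probably be the natural tool.

```python
import os
import tempfile
from typing import Optional, Sequence


def find_library(prefix: str, filename: str,
                 lib_dirs: Sequence[str] = ("lib", "lib64")) -> Optional[str]:
    """Return the first existing path to `filename` under `prefix`, trying each
    candidate library directory in order, or None if it is not found."""
    for lib_dir in lib_dirs:
        candidate = os.path.join(prefix, lib_dir, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Demonstration with a throwaway prefix where the library lives in lib64.
with tempfile.TemporaryDirectory() as prefix:
    os.makedirs(os.path.join(prefix, "lib64"))
    lib_path = os.path.join(prefix, "lib64", "libexample.so")
    open(lib_path, "w").close()
    found = find_library(prefix, "libexample.so")
    found_in_lib64 = found == lib_path
    missing = find_library(prefix, "libother.so")
```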





[jira] [Created] (ARROW-5020) [C++][Gandiva] Split Gandiva-related conda packages for builds into separate .yml conda env file

2019-03-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-5020:
---

 Summary: [C++][Gandiva] Split Gandiva-related conda packages for 
builds into separate .yml conda env file
 Key: ARROW-5020
 URL: https://issues.apache.org/jira/browse/ARROW-5020
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Continuous Integration
Reporter: Wes McKinney
 Fix For: 0.14.0


These installs are large and should not be required unconditionally in CI and 
elsewhere





[jira] [Created] (ARROW-5019) [C#] ArrowStreamWriter doesn't work on a non-seekable stream

2019-03-26 Thread Eric Erhardt (JIRA)
Eric Erhardt created ARROW-5019:
---

 Summary: [C#] ArrowStreamWriter doesn't work on a non-seekable 
stream
 Key: ARROW-5019
 URL: https://issues.apache.org/jira/browse/ARROW-5019
 Project: Apache Arrow
  Issue Type: Bug
  Components: C#
Reporter: Eric Erhardt
Assignee: Eric Erhardt


When writing to a non-seekable .NET Stream (like a network/socket stream), 
ArrowStreamWriter will throw an exception:

 
{code:java}
Exception thrown: 'System.NotSupportedException' in System.Net.Sockets.dll
This stream does not support seek operations.
{code}
This throws because we are using `BaseStream.Position` in the 
writer to calculate the number of bytes that we've written to the stream. We 
don't need the Position in order to calculate the lengths. We should be 
able to write an Arrow RecordBatch to a NetworkStream directly. Today, we need 
to write to a MemoryStream, and then copy the MemoryStream to the NetworkStream.
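
A minimal sketch of the idea, in Python rather than C#: the writer keeps its
own running byte count instead of asking the stream for its position, so it
also works on streams that cannot seek. The class names `NonSeekableSink`
and `CountingWriter` are hypothetical and do not correspond to anything in
the Arrow C# library.

```python
import io


class NonSeekableSink(io.RawIOBase):
    """Stand-in for a socket-like stream: writable, but not seekable."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def writable(self):
        return True

    def seekable(self):
        return False

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


class CountingWriter:
    """Wraps a stream and tracks how many bytes were written through it,
    so the caller never has to ask the stream for its position."""

    def __init__(self, stream):
        self._stream = stream
        self.bytes_written = 0

    def write(self, data):
        self._stream.write(data)
        self.bytes_written += len(data)


sink = NonSeekableSink()
writer = CountingWriter(sink)
writer.write(b"\xff\xff\xff\xff")   # first chunk, 4 bytes
writer.write(b"payload")            # second chunk, 7 bytes
```

The length of anything written between two points is then the difference of
two `bytes_written` readings, with no seek support required.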

 





[jira] [Created] (ARROW-5018) [Release] Include JavaScript implementation

2019-03-26 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-5018:
---

 Summary: [Release] Include JavaScript implementation
 Key: ARROW-5018
 URL: https://issues.apache.org/jira/browse/ARROW-5018
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Created] (ARROW-5017) [C++] [CI] Thrift not found on Azure Pipelines

2019-03-26 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-5017:
-

 Summary: [C++] [CI] Thrift not found on Azure Pipelines
 Key: ARROW-5017
 URL: https://issues.apache.org/jira/browse/ARROW-5017
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Continuous Integration
Reporter: Antoine Pitrou


I don't understand why this happens. The conda-forge package for {{thrift-cpp}} 
is the same as installed on AppVeyor, yet on Azure Pipelines the static library 
isn't found:
https://dev.azure.com/pitrou/arrow/_build/results?buildId=70

{code}
-- Checking for module 'thrift'
--   No package 'thrift' found
CMake Error at 
D:/a/1/conda-envs/arrow/Library/share/cmake-3.14/Modules/FindPackageHandleStandardArgs.cmake:137
 (message):
  Could NOT find Thrift (missing: THRIFT_STATIC_LIB)
Call Stack (most recent call first):
  
D:/a/1/conda-envs/arrow/Library/share/cmake-3.14/Modules/FindPackageHandleStandardArgs.cmake:378
 (_FPHSA_FAILURE_MESSAGE)
  cmake_modules/FindThrift.cmake:94 (find_package_handle_standard_args)
  cmake_modules/ThirdpartyToolchain.cmake:146 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:1076 (resolve_dependency)
  CMakeLists.txt:544 (include)
{code}






Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Javier Luraschi
Great, thank you, Wes!


Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Wes McKinney
Javier -- it was merged using our usual merge tool.


Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Javier Luraschi
https://github.com/apache/arrow/pull/4011 got closed but is needed for 0.13
for the R package. Could someone reopen it, please? We can wait for Romain to
give an LGTM and then merge. Thank you!


Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Kouhei Sutou
Hi,

We don't freeze anything.
We can merge pull requests when they are ready to merge as usual.


Thanks,
--
kou

In <6531c437-6bf9-61cd-c7c6-7be06fa96...@python.org>
  "Re: Timeline for 0.13 Arrow release" on Tue, 26 Mar 2019 14:58:06 +0100,
  Antoine Pitrou  wrote:

> 
> Hi Kou,
> 
> What should be the policy for merges until 0.13.0 is released?
> Do you want to instate a feature freeze or commit freeze?
> 
> Regards
> 
> Antoine.
> 
> 
> Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
>> Hi,
>> 
>> There are no blockers on GLib, Ruby and Linux packages.
>> 
>> Can we include JavaScript in 0.13.0?
>> If we include JavaScript in 0.13.0, we can remove the
>> code for releasing JavaScript separately. For example, we can
>> remove dev/release/js-*. We can enable the version update code
>> in dev/release/00-prepare.sh:
>> https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
>> 
>> We can merge "JavaScript Releases" document into our release
>> document:
>> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
>> 
>> 
>> Thanks,
>> --
>> kou
>> 
>> In 
>>   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12 -0500,
>>   Wes McKinney  wrote:
>> 
>>> hi folks,
>>>
>>> I think we're basically at the 0.13 end game here. There are some more
>>> patches that can get in, but do we all think we can cut an RC by the end of
>>> the week? What are the blocking issues?
>>>
>>> Thanks
>>> Wes
>>>
>>> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:

>>>> Hi,
>>>>
>>>>> Submitted the packaging builds:
>>>>> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
>>>>
>>>> I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
>>>> It has been merged.
>>>> So .deb/.rpm packages are ready for release.
>>>>
>>>> Thanks,
>>>> --
>>>> kou
>>>>
>>>> In
>>>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 +0100,
>>>>   Krisztián Szűcs wrote:
>>>>
>>>>> Submitted the packaging builds:
>>>>> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
>>>>>
>>>>> On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney wrote:
>>>>>
>>>>>> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
>>>>>> this.
>>>>>>
>>>>>> We should run all the packaging tasks and get a full accounting of
>>>>>> what is broken so we aren't surprised during the release process
>>>>>>
>>>>>> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs wrote:
>>>>>>>
>>>>>>> The proof of the pudding is in the eating. You convinced me.
>>>>>>>
>>>>>>> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney wrote:
>>>>>>>
>>>>>>>> Krisztian -- are you all right with proceeding with merging the CMake
>>>>>>>> refactor? I'm pretty committed to helping fix the problems that come
>>>>>>>> up. Since most consumers of the project don't test until _after_ a
>>>>>>>> release, we won't find out about some problems until we merge it and
>>>>>>>> release it. Thus, IMHO it doesn't make sense to wait another 8-10
>>>>>>>> weeks since we'd be delaying feedback for that long. There are also a
>>>>>>>> number of follow-on issues blocking on the refactor
>>>>>>>>
>>>>>>>> On Tue, Mar 12, 2019 at 11:39 AM Andy Grove wrote:
>>>>>>>>>
>>>>>>>>> I've cleaned up my issues for Rust, moving most of them to 0.14.0.
>>>>>>>>>
>>>>>>>>> I have two PRs in progress that I would appreciate reviews on:
>>>>>>>>>
>>>>>>>>> https://github.com/apache/arrow/pull/3671 - [Rust] Table API (a.k.a
>>>>>>>>> DataFrame)
>>>>>>>>>
>>>>>>>>> https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data source in
>>>>>>>>> DataFusion
>>>>>>>>>
>>>>>>>>> Once these are merged I have some small follow up PRs for 0.13.0 that I can
>>>>>>>>> get done this week.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Andy.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney wrote:
>>>>>>>>>
>>>>>>>>>> hi folks,
>>>>>>>>>>
>>>>>>>>>> I think we are on track to be able to release toward the end of this
>>>>>>>>>> month. My proposed timeline:
>>>>>>>>>>
>>>>>>>>>> * This week (March 11-15): feature/improvement push mostly
>>>>>>>>>> * Next week (March 18-22): shift to bug fixes, stabilization, empty
>>>>>>>>>> backlog of feature/improvement JIRAs
>>>>>>>>>> * Week of March 25: propose release candidate
>>>>>>>>>>
>>>>>>>>>> Does this seem reasonable? This puts us at about 9-10 weeks from 0.12.
>>>>>>>>>>
>>>>>>>>>> We need an RM for 0.13, any PMCs want to volunteer?
>>>>>>>>>>
>>>>>>>>>> Take a look at our release page:
>>>>>>>>>>
>>>>>>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103091219
>>>>>>>>>>
>>>>>>>>>> Out of the open or in-progress issues, we have:
>>>>>>>>>>
>>>>>>>>>> * C#: 3 issues
>>>>>>>>>> * C++ (all components): 51 issues
>>>>>>>>>> * Java: 3 issues
>>>>>>>>>> * Python: 38 issues
>>>>>>>>>> * Rust 

[jira] [Created] (ARROW-5016) Failed to convert 'float' to 'double' with using pandas_udf and pyspark

2019-03-26 Thread Dat Nguyen (JIRA)
Dat Nguyen created ARROW-5016:
-

 Summary: Failed to convert 'float' to 'double' with using 
pandas_udf and pyspark
 Key: ARROW-5016
 URL: https://issues.apache.org/jira/browse/ARROW-5016
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.12.1
 Environment: Linux 68b0517ddf1c 3.10.0-862.11.6.el7.x86_64 #1 SMP 
GNU/Linux

Reporter: Dat Nguyen


Hi everyone,

I would like to report a (potential) bug. I followed the official guide, 
[Usage Guide for Pandas with Apache Arrow|https://spark.apache.org/docs/2.4.0/sql-pyspark-pandas-with-arrow.html].

However, `libarrow` throws an error for the type conversion from float to double. 
Here is the example and its output.

 pyarrow==0.12.1

{code:title=reproduce_bug.py}
from pyspark.sql import SparkSession, SQLContext
from pyspark.sql.functions import pandas_udf, PandasUDFType, col

spark = SparkSession.builder.appName('ReproduceBug').getOrCreate()

df = spark.createDataFrame(
    [(1, "a"), (1, "a"), (1, "b")],
    ("id", "value"))
df.show()
# Spark DataFrame
# +---+-----+
# | id|value|
# +---+-----+
# |  1|    a|
# |  1|    a|
# |  1|    b|
# +---+-----+

# Potential Bug #
@pandas_udf('double', PandasUDFType.GROUPED_AGG)
def compute_frequencies(sha256):
    total = sha256.count()
    per_groups = sha256.groupby(sha256).transform('count')
    score = per_groups / total
    return score

df.groupBy("id")\
  .agg(compute_frequencies(col('value')))\
  .show()

spark.stop()
{code}
 
{code:title=output}
---
Py4JJavaError Traceback (most recent call last)
 in 
 32 
 33 df.groupBy("id")\
---> 34   .agg(compute_frequencies(col('value')))\
 35   .show()
 36 

/usr/local/spark/python/pyspark/sql/dataframe.py in show(self, n, truncate, 
vertical)
376 """
377 if isinstance(truncate, bool) and truncate:
--> 378 print(self._jdf.showString(n, 20, vertical))
379 else:
380 print(self._jdf.showString(n, int(truncate), vertical))

/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in 
__call__(self, *args)
   1255 answer = self.gateway_client.send_command(command)
   1256 return_value = get_return_value(
-> 1257 answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259 for temp_arg in temp_args:

/usr/local/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
 61 def deco(*a, **kw):
 62 try:
---> 63 return f(*a, **kw)
 64 except py4j.protocol.Py4JJavaError as e:
 65 s = e.java_exception.toString()

/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in 
get_return_value(answer, gateway_client, target_id, name)
326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
--> 328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(

Py4JJavaError: An error occurred while calling o186.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 44 in 
stage 23.0 failed 1 times, most recent failure: Lost task 44.0 in stage 23.0 
(TID 601, localhost, executor driver): 
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 372, 
in main
process()
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 367, 
in process
serializer.dump_stream(func(split_index, iterator), outfile)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 
284, in dump_stream
batch = _create_batch(series, self._timezone)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 
253, in _create_batch
arrs = [create_array(s, t) for s, t in series]
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 
253, in 
arrs = [create_array(s, t) for s, t in series]
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 
251, in create_array
return pa.Array.from_pandas(s, mask=mask, type=t)
  File "pyarrow/array.pxi", line 536, in pyarrow.lib.Array.from_pandas
  File "pyarrow/array.pxi", line 176, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 85, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Could not convert 0    0.67
1    0.67
2    0.33
Name: _0, dtype: float64 with type Series: tried to convert to double
{code}

Please let me know if you would like any further information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-5015) [R] Validate winlibs binaries with hash

2019-03-26 Thread Javier Luraschi (JIRA)
Javier Luraschi created ARROW-5015:
--

 Summary: [R] Validate winlibs binaries with hash
 Key: ARROW-5015
 URL: https://issues.apache.org/jira/browse/ARROW-5015
 Project: Apache Arrow
  Issue Type: Improvement
  Components: R
Reporter: Javier Luraschi


See [https://github.com/apache/arrow/pull/4011#discussion_r269229280]

It is a common practice to download binaries from the winlibs R repo; however, 
for the arrow project, validating the package against a SHA256/SHA512 hash is 
desired.
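
As a rough illustration of the check being asked for, the sketch below
verifies a downloaded archive against a pinned SHA-256 digest before it is
used. It is written in Python purely for illustration (the R package's own
build script would do the equivalent in R), and the function names
sha256_of_file and verify_download are assumptions, not part of the Arrow or
winlibs tooling.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Raise ValueError unless the file's digest matches the pinned value."""
    actual = sha256_of_file(path)
    # Hex digests are case-insensitive, so normalise the pinned value first.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(
            "checksum mismatch for {}: expected {}, got {}".format(
                path, expected_sha256, actual))
    return Path(path)
```

The pinned digest would live next to the download URL in the build script, so
that a changed or corrupted binary fails the build instead of being unpacked
silently. Swapping hashlib.sha256 for hashlib.sha512 gives the SHA512 variant.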





Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Antoine Pitrou


https://github.com/apache/arrow/pull/4005 is ready to go in as well
(pending CI), but can also be deferred to 0.14.

Regards

Antoine.


Le 26/03/2019 à 18:49, Wes McKinney a écrit :
> OK, I reviewed everything, and ARROW-4646 [1] just needs to be merged,
> then I think that is a wrap if nothing else pops up
> 
> [1]: https://github.com/apache/arrow/pull/4024
> 
> On Tue, Mar 26, 2019 at 12:31 PM Wes McKinney  wrote:
>>
>> Thanks Krisz -- I will take a look at those also now
>>
>> On Tue, Mar 26, 2019 at 12:27 PM Krisztián Szűcs
>>  wrote:
>>>
>>> On Tue, Mar 26, 2019 at 6:19 PM Wes McKinney  wrote:
>>>
 I think it's OK to keep merging patches (within reason) until an RC is
 cut so long as the build isn't broken.

 It looks like we have 3 issues marked still for 0.13

 ARROW-4645: Gandiva in wheels
 ARROW-4646: Gandiva in conda packages
 ARROW-4995: Windows build support for R

 If 4645/4646 are not merge-ready by end of day today I think this work
 can be pushed into 0.14 since Gandiva is not blocking much user-facing
 functionality at the moment

>>> Note that 4645/4646 also contains the fixes for the CMake rewrite, without
>>> these PRs the packaging builds will fail. The conda-forge builds can be
>>> altered after the release, but the wheels can't. BTW they are ready to
>>> merge.
>>>

 ARROW-4995 should be merged to give the R folks a fighting chance at a
 CRAN submission after the release. I'll take a look at that now

 Anything else that must go in?

 - Wes

 On Tue, Mar 26, 2019 at 8:58 AM Antoine Pitrou  wrote:
>
>
> Hi Kou,
>
> What should be the policy for merges until 0.13.0 is released?
> Do you want to instate a feature freeze or commit freeze?
>
> Regards
>
> Antoine.
>
>
> Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
>> Hi,
>>
>> There are no blockers on GLib, Ruby and Linux packages.
>>
>> Can we include JavaScript into 0.13.0?
>> If we include JavaScript into 0.13.0, we can remove
>> codes to release JavaScript separately. For example, we can
>> remove dev/release/js-*. We can enable version update code
>> in dev/release/00-prepare.sh:
>>
 https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
>>
>> We can merge "JavaScript Releases" document into our release
>> document:
>>
 https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In 
>>   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12
 -0500,
>>   Wes McKinney  wrote:
>>
>>> hi folks,
>>>
>>> I think we're basically at the 0.13 end game here. There's some more
>>> patches can get in, but do we all think we can cut an RC by the end of
>>> the week? What are the blocking issues?
>>>
>>> Thanks
>>> Wes
>>>
>>> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou 
 wrote:

 Hi,

> Submitted the packaging builds:
>
 https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452

 I've fixed .deb/.rpm packages:
 https://github.com/apache/arrow/pull/3934
 It has been merged.
 So .deb/.rpm packages are ready for release.

 Thanks,
 --
 kou

 In <
 cahm19a5somzxgcphc6ee-mr2usvvhwb252udgjrvocq-cb2...@mail.gmail.com>
   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43
 +0100,
   Krisztián Szűcs  wrote:

> Submitted the packaging builds:
>
 https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
>
> On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney 
 wrote:
>
>> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard
 labor on
>> this.
>>
>> We should run all the packaging tasks and get a full accounting of
>> what is broken so we aren't surprised during the release process
>>
>> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
>>  wrote:
>>>
>>> The proof of the pudding is in the eating. You convinced me.
>>>
>>> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney 
>> wrote:
>>>
 Krisztian -- are you all right with proceeding with merging the
 CMake
 refactor? I'm pretty committed to helping fix the problems that
 come
 up. Since most consumers of the project don't test until _after_
 a
 release, we won't find out about some problems until we merge it
 and
 release it. Thus, IMHO it doesn't make sense to wait another 8-10
 weeks since we'd be delaying feedback for th

Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Wes McKinney
OK, I reviewed everything, and ARROW-4646 [1] just needs to be merged.
Then I think that is a wrap, if nothing else pops up.

[1]: https://github.com/apache/arrow/pull/4024

On Tue, Mar 26, 2019 at 12:31 PM Wes McKinney wrote:
>
> Thanks Krisz -- I will take a look at those also now
>
> On Tue, Mar 26, 2019 at 12:27 PM Krisztián Szűcs
>  wrote:
> >
> > On Tue, Mar 26, 2019 at 6:19 PM Wes McKinney  wrote:
> >
> > > I think it's OK to keep merging patches (within reason) until an RC is
> > > cut so long as the build isn't broken.
> > >
> > > It looks like we have 3 issues marked still for 0.13
> > >
> > > ARROW-4645: Gandiva in wheels
> > > ARROW-4646: Gandiva in conda packages
> > > ARROW-4995: Windows build support for R
> > >
> > > If 4645/4646 are not merge-ready by end of day today I think this work
> > > can be pushed into 0.14 since Gandiva is not blocking much user-facing
> > > functionality at the moment
> > >
> > Note that 4645/4646 also contains the fixes for the CMake rewrite, without
> > these PRs the packaging builds will fail. The conda-forge builds can be
> > altered after the release, but the wheels can't. BTW they are ready to
> > merge.
> >
> > >
> > > ARROW-4995 should be merged to give the R folks a fighting chance at a
> > > CRAN submission after the release. I'll take a look at that now
> > >
> > > Anything else that must go in?
> > >
> > > - Wes
> > >
> > > On Tue, Mar 26, 2019 at 8:58 AM Antoine Pitrou  wrote:
> > > >
> > > >
> > > > Hi Kou,
> > > >
> > > > What should be the policy for merges until 0.13.0 is released?
> > > > Do you want to instate a feature freeze or commit freeze?
> > > >
> > > > Regards
> > > >
> > > > Antoine.
> > > >
> > > >
> > > > Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
> > > > > Hi,
> > > > >
> > > > > There are no blockers on GLib, Ruby and Linux packages.
> > > > >
> > > > > Can we include JavaScript into 0.13.0?
> > > > > If we include JavaScript into 0.13.0, we can remove
> > > > > codes to release JavaScript separately. For example, we can
> > > > > remove dev/release/js-*. We can enable version update code
> > > > > in dev/release/00-prepare.sh:
> > > > >
> > > https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
> > > > >
> > > > > We can merge "JavaScript Releases" document into our release
> > > > > document:
> > > > >
> > > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
> > > > >
> > > > >
> > > > > Thanks,
> > > > > --
> > > > > kou
> > > > >
> > > > > In  > > >
> > > > >   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12
> > > -0500,
> > > > >   Wes McKinney  wrote:
> > > > >
> > > > >> hi folks,
> > > > >>
> > > > >> I think we're basically at the 0.13 end game here. There's some more
> > > > >> patches can get in, but do we all think we can cut an RC by the end 
> > > > >> of
> > > > >> the week? What are the blocking issues?
> > > > >>
> > > > >> Thanks
> > > > >> Wes
> > > > >>
> > > > >> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou 
> > > wrote:
> > > > >>>
> > > > >>> Hi,
> > > > >>>
> > > >  Submitted the packaging builds:
> > > > 
> > > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> > > > >>>
> > > > >>> I've fixed .deb/.rpm packages:
> > > https://github.com/apache/arrow/pull/3934
> > > > >>> It has been merged.
> > > > >>> So .deb/.rpm packages are ready for release.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> --
> > > > >>> kou
> > > > >>>
> > > > >>> In <
> > > cahm19a5somzxgcphc6ee-mr2usvvhwb252udgjrvocq-cb2...@mail.gmail.com>
> > > > >>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43
> > > +0100,
> > > > >>>   Krisztián Szűcs  wrote:
> > > > >>>
> > > >  Submitted the packaging builds:
> > > > 
> > > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> > > > 
> > > >  On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney 
> > > wrote:
> > > > 
> > > > > The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard
> > > labor on
> > > > > this.
> > > > >
> > > > > We should run all the packaging tasks and get a full accounting of
> > > > > what is broken so we aren't surprised during the release process
> > > > >
> > > > > On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
> > > > >  wrote:
> > > > >>
> > > > >> The proof of the pudding is in the eating. You convinced me.
> > > > >>
> > > > >> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney  > > >
> > > > > wrote:
> > > > >>
> > > > >>> Krisztian -- are you all right with proceeding with merging the
> > > CMake
> > > > >>> refactor? I'm pretty committed to helping fix the problems that
> > > come
> > > > >>> up. Since most consumers of the project don't test until _after_
> > > a
> > > > >>> release, we won't find out about some problems until we merge it
> > > and
> > > > >>> release it. Thus, 

Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Wes McKinney
Thanks Krisz -- I will take a look at those also now

On Tue, Mar 26, 2019 at 12:27 PM Krisztián Szűcs wrote:
>
> On Tue, Mar 26, 2019 at 6:19 PM Wes McKinney  wrote:
>
> > I think it's OK to keep merging patches (within reason) until an RC is
> > cut so long as the build isn't broken.
> >
> > It looks like we have 3 issues marked still for 0.13
> >
> > ARROW-4645: Gandiva in wheels
> > ARROW-4646: Gandiva in conda packages
> > ARROW-4995: Windows build support for R
> >
> > If 4645/4646 are not merge-ready by end of day today I think this work
> > can be pushed into 0.14 since Gandiva is not blocking much user-facing
> > functionality at the moment
> >
> Note that 4645/4646 also contains the fixes for the CMake rewrite, without
> these PRs the packaging builds will fail. The conda-forge builds can be
> altered after the release, but the wheels can't. BTW they are ready to
> merge.
>
> >
> > ARROW-4995 should be merged to give the R folks a fighting chance at a
> > CRAN submission after the release. I'll take a look at that now
> >
> > Anything else that must go in?
> >
> > - Wes
> >
> > On Tue, Mar 26, 2019 at 8:58 AM Antoine Pitrou  wrote:
> > >
> > >
> > > Hi Kou,
> > >
> > > What should be the policy for merges until 0.13.0 is released?
> > > Do you want to instate a feature freeze or commit freeze?
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
> > > > Hi,
> > > >
> > > > There are no blockers on GLib, Ruby and Linux packages.
> > > >
> > > > Can we include JavaScript into 0.13.0?
> > > > If we include JavaScript into 0.13.0, we can remove
> > > > codes to release JavaScript separately. For example, we can
> > > > remove dev/release/js-*. We can enable version update code
> > > > in dev/release/00-prepare.sh:
> > > >
> > https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
> > > >
> > > > We can merge "JavaScript Releases" document into our release
> > > > document:
> > > >
> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
> > > >
> > > >
> > > > Thanks,
> > > > --
> > > > kou
> > > >
> > > > In  > >
> > > >   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12
> > -0500,
> > > >   Wes McKinney  wrote:
> > > >
> > > >> hi folks,
> > > >>
> > > >> I think we're basically at the 0.13 end game here. There's some more
> > > >> patches can get in, but do we all think we can cut an RC by the end of
> > > >> the week? What are the blocking issues?
> > > >>
> > > >> Thanks
> > > >> Wes
> > > >>
> > > >> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou 
> > wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > >  Submitted the packaging builds:
> > > 
> > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> > > >>>
> > > >>> I've fixed .deb/.rpm packages:
> > https://github.com/apache/arrow/pull/3934
> > > >>> It has been merged.
> > > >>> So .deb/.rpm packages are ready for release.
> > > >>>
> > > >>> Thanks,
> > > >>> --
> > > >>> kou
> > > >>>
> > > >>> In <
> > cahm19a5somzxgcphc6ee-mr2usvvhwb252udgjrvocq-cb2...@mail.gmail.com>
> > > >>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43
> > +0100,
> > > >>>   Krisztián Szűcs  wrote:
> > > >>>
> > >  Submitted the packaging builds:
> > > 
> > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> > > 
> > >  On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney 
> > wrote:
> > > 
> > > > The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard
> > labor on
> > > > this.
> > > >
> > > > We should run all the packaging tasks and get a full accounting of
> > > > what is broken so we aren't surprised during the release process
> > > >
> > > > On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
> > > >  wrote:
> > > >>
> > > >> The proof of the pudding is in the eating. You convinced me.
> > > >>
> > > >> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney  > >
> > > > wrote:
> > > >>
> > > >>> Krisztian -- are you all right with proceeding with merging the
> > CMake
> > > >>> refactor? I'm pretty committed to helping fix the problems that
> > come
> > > >>> up. Since most consumers of the project don't test until _after_
> > a
> > > >>> release, we won't find out about some problems until we merge it
> > and
> > > >>> release it. Thus, IMHO it doesn't make sense to wait another 8-10
> > > >>> weeks since we'd be delaying feedback for that long. There are
> > also a
> > > >>> number of follow-on issues blocking on the refactor
> > > >>>
> > > >>> On Tue, Mar 12, 2019 at 11:39 AM Andy Grove <
> > andygrov...@gmail.com>
> > > > wrote:
> > > 
> > >  I've cleaned up my issues for Rust, moving most of them to
> > 0.14.0.
> > > 
> > >  I have two PRs in progress that I would appreciate reviews on:
> >

Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Krisztián Szűcs
On Tue, Mar 26, 2019 at 6:19 PM Wes McKinney wrote:

> I think it's OK to keep merging patches (within reason) until an RC is
> cut so long as the build isn't broken.
>
> It looks like we have 3 issues marked still for 0.13
>
> ARROW-4645: Gandiva in wheels
> ARROW-4646: Gandiva in conda packages
> ARROW-4995: Windows build support for R
>
> If 4645/4646 are not merge-ready by end of day today I think this work
> can be pushed into 0.14 since Gandiva is not blocking much user-facing
> functionality at the moment
>
Note that 4645/4646 also contain the fixes for the CMake rewrite; without
these PRs the packaging builds will fail. The conda-forge builds can be
altered after the release, but the wheels can't. BTW, they are ready to
merge.

>
> ARROW-4995 should be merged to give the R folks a fighting chance at a
> CRAN submission after the release. I'll take a look at that now
>
> Anything else that must go in?
>
> - Wes
>
> On Tue, Mar 26, 2019 at 8:58 AM Antoine Pitrou  wrote:
> >
> >
> > Hi Kou,
> >
> > What should be the policy for merges until 0.13.0 is released?
> > Do you want to instate a feature freeze or commit freeze?
> >
> > Regards
> >
> > Antoine.
> >
> >
> > Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
> > > Hi,
> > >
> > > There are no blockers on GLib, Ruby and Linux packages.
> > >
> > > Can we include JavaScript into 0.13.0?
> > > If we include JavaScript into 0.13.0, we can remove
> > > codes to release JavaScript separately. For example, we can
> > > remove dev/release/js-*. We can enable version update code
> > > in dev/release/00-prepare.sh:
> > >
> https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
> > >
> > > We can merge "JavaScript Releases" document into our release
> > > document:
> > >
> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
> > >
> > >
> > > Thanks,
> > > --
> > > kou
> > >
> > > In  >
> > >   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12
> -0500,
> > >   Wes McKinney  wrote:
> > >
> > >> hi folks,
> > >>
> > >> I think we're basically at the 0.13 end game here. There's some more
> > >> patches can get in, but do we all think we can cut an RC by the end of
> > >> the week? What are the blocking issues?
> > >>
> > >> Thanks
> > >> Wes
> > >>
> > >> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou 
> wrote:
> > >>>
> > >>> Hi,
> > >>>
> >  Submitted the packaging builds:
> > 
> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> > >>>
> > >>> I've fixed .deb/.rpm packages:
> https://github.com/apache/arrow/pull/3934
> > >>> It has been merged.
> > >>> So .deb/.rpm packages are ready for release.
> > >>>
> > >>> Thanks,
> > >>> --
> > >>> kou
> > >>>
> > >>> In <
> cahm19a5somzxgcphc6ee-mr2usvvhwb252udgjrvocq-cb2...@mail.gmail.com>
> > >>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43
> +0100,
> > >>>   Krisztián Szűcs  wrote:
> > >>>
> >  Submitted the packaging builds:
> > 
> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> > 
> >  On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney 
> wrote:
> > 
> > > The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard
> labor on
> > > this.
> > >
> > > We should run all the packaging tasks and get a full accounting of
> > > what is broken so we aren't surprised during the release process
> > >
> > > On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
> > >  wrote:
> > >>
> > >> The proof of the pudding is in the eating. You convinced me.
> > >>
> > >> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney  >
> > > wrote:
> > >>
> > >>> Krisztian -- are you all right with proceeding with merging the
> CMake
> > >>> refactor? I'm pretty committed to helping fix the problems that
> come
> > >>> up. Since most consumers of the project don't test until _after_
> a
> > >>> release, we won't find out about some problems until we merge it
> and
> > >>> release it. Thus, IMHO it doesn't make sense to wait another 8-10
> > >>> weeks since we'd be delaying feedback for that long. There are
> also a
> > >>> number of follow-on issues blocking on the refactor
> > >>>
> > >>> On Tue, Mar 12, 2019 at 11:39 AM Andy Grove <
> andygrov...@gmail.com>
> > > wrote:
> > 
> >  I've cleaned up my issues for Rust, moving most of them to
> 0.14.0.
> > 
> >  I have two PRs in progress that I would appreciate reviews on:
> > 
> >  https://github.com/apache/arrow/pull/3671 - [Rust] Table API
> (a.k.a
> >  DataFrame)
> > 
> >  https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data
> > > source
> > >>> in
> >  DataFusion
> > 
> >  Once these are merged I have some small follow up PRs for 0.13.0
> > > that I
> > >>> can
> >  get d

Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Wes McKinney
I think it's OK to keep merging patches (within reason) until an RC is
cut so long as the build isn't broken.

It looks like we still have 3 issues marked for 0.13:

ARROW-4645: Gandiva in wheels
ARROW-4646: Gandiva in conda packages
ARROW-4995: Windows build support for R

If 4645/4646 are not merge-ready by end of day today, I think this work
can be pushed into 0.14, since Gandiva is not blocking much user-facing
functionality at the moment.

ARROW-4995 should be merged to give the R folks a fighting chance at a
CRAN submission after the release. I'll take a look at that now.

Anything else that must go in?

- Wes

On Tue, Mar 26, 2019 at 8:58 AM Antoine Pitrou wrote:
>
>
> Hi Kou,
>
> What should be the policy for merges until 0.13.0 is released?
> Do you want to instate a feature freeze or commit freeze?
>
> Regards
>
> Antoine.
>
>
> Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
> > Hi,
> >
> > There are no blockers on GLib, Ruby and Linux packages.
> >
> > Can we include JavaScript into 0.13.0?
> > If we include JavaScript into 0.13.0, we can remove
> > codes to release JavaScript separately. For example, we can
> > remove dev/release/js-*. We can enable version update code
> > in dev/release/00-prepare.sh:
> > https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
> >
> > We can merge "JavaScript Releases" document into our release
> > document:
> > https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
> >
> >
> > Thanks,
> > --
> > kou
> >
> > In 
> >   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12 -0500,
> >   Wes McKinney  wrote:
> >
> >> hi folks,
> >>
> >> I think we're basically at the 0.13 end game here. There's some more
> >> patches can get in, but do we all think we can cut an RC by the end of
> >> the week? What are the blocking issues?
> >>
> >> Thanks
> >> Wes
> >>
> >> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:
> >>>
> >>> Hi,
> >>>
>  Submitted the packaging builds:
>  https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> >>>
> >>> I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
> >>> It has been merged.
> >>> So .deb/.rpm packages are ready for release.
> >>>
> >>> Thanks,
> >>> --
> >>> kou
> >>>
> >>> In 
> >>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 
> >>> +0100,
> >>>   Krisztián Szűcs  wrote:
> >>>
>  Submitted the packaging builds:
>  https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
> 
>  On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney  wrote:
> 
> > The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
> > this.
> >
> > We should run all the packaging tasks and get a full accounting of
> > what is broken so we aren't surprised during the release process
> >
> > On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
> >  wrote:
> >>
> >> The proof of the pudding is in the eating. You convinced me.
> >>
> >> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney 
> > wrote:
> >>
> >>> Krisztian -- are you all right with proceeding with merging the CMake
> >>> refactor? I'm pretty committed to helping fix the problems that come
> >>> up. Since most consumers of the project don't test until _after_ a
> >>> release, we won't find out about some problems until we merge it and
> >>> release it. Thus, IMHO it doesn't make sense to wait another 8-10
> >>> weeks since we'd be delaying feedback for that long. There are also a
> >>> number of follow-on issues blocking on the refactor
> >>>
> >>> On Tue, Mar 12, 2019 at 11:39 AM Andy Grove 
> > wrote:
> 
>  I've cleaned up my issues for Rust, moving most of them to 0.14.0.
> 
>  I have two PRs in progress that I would appreciate reviews on:
> 
>  https://github.com/apache/arrow/pull/3671 - [Rust] Table API (a.k.a
>  DataFrame)
> 
>  https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data
> > source
> >>> in
>  DataFusion
> 
>  Once these are merged I have some small follow up PRs for 0.13.0
> > that I
> >>> can
>  get done this week.
> 
>  Thanks,
> 
>  Andy.
> 
> 
>  On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney 
> >>> wrote:
> 
> > hi folks,
> >
> > I think we are on track to be able to release toward the end of
> > this
> > month. My proposed timeline:
> >
> > * This week (March 11-15): feature/improvement push mostly
> > * Next week (March 18-22): shift to bug fixes, stabilization, empty
> > backlog of feature/improvement JIRAs
> > * Week of March 25: propose release candidate
> >
> > Does this seem reasonable? Th

[jira] [Created] (ARROW-5014) [Java] Fix typos in Flight module

2019-03-26 Thread Bryan Cutler (JIRA)
Bryan Cutler created ARROW-5014:
---

 Summary: [Java] Fix typos in Flight module
 Key: ARROW-5014
 URL: https://issues.apache.org/jira/browse/ARROW-5014
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java
Reporter: Bryan Cutler
Assignee: Bryan Cutler








[jira] [Created] (ARROW-5013) [Rust] [DataFusion] Refactor runtime expression support

2019-03-26 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5013:
-

 Summary: [Rust] [DataFusion] Refactor runtime expression support
 Key: ARROW-5013
 URL: https://issues.apache.org/jira/browse/ARROW-5013
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Affects Versions: 0.13.0
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 0.14.0


Refactor the runtime/compiled expression support to fix tech debt and prepare 
for implementing COUNT





[jira] [Created] (ARROW-5012) [C++] "testing" headers not installed

2019-03-26 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-5012:
-

 Summary: [C++] "testing" headers not installed
 Key: ARROW-5012
 URL: https://issues.apache.org/jira/browse/ARROW-5012
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Antoine Pitrou
 Fix For: 0.13.0


The {{src/arrow/testing}} headers should be installed along with the rest.





Re: Timeline for 0.13 Arrow release

2019-03-26 Thread Antoine Pitrou


Hi Kou,

What should be the policy for merges until 0.13.0 is released?
Do you want to instate a feature freeze or commit freeze?

Regards

Antoine.


Le 19/03/2019 à 15:46, Kouhei Sutou a écrit :
> Hi,
> 
> There are no blockers on GLib, Ruby and Linux packages.
> 
> Can we include JavaScript into 0.13.0?
> If we include JavaScript into 0.13.0, we can remove
> codes to release JavaScript separately. For example, we can
> remove dev/release/js-*. We can enable version update code
> in dev/release/00-prepare.sh:
> https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
> 
> We can merge "JavaScript Releases" document into our release
> document:
> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
> 
> 
> Thanks,
> --
> kou
> 
> In 
>   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12 -0500,
>   Wes McKinney  wrote:
> 
>> hi folks,
>>
>> I think we're basically at the 0.13 end game here. There's some more
>> patches can get in, but do we all think we can cut an RC by the end of
>> the week? What are the blocking issues?
>>
>> Thanks
>> Wes
>>
>> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:
>>>
>>> Hi,
>>>
 Submitted the packaging builds:
 https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452
>>>
>>> I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
>>> It has been merged.
>>> So .deb/.rpm packages are ready for release.
>>>
>>> Thanks,
>>> --
>>> kou
>>>
>>> In 
>>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 +0100,
>>>   Krisztián Szűcs  wrote:
>>>
 Submitted the packaging builds:
 https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93&query=build-452

 On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney  wrote:

> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
> this.
>
> We should run all the packaging tasks and get a full accounting of
> what is broken so we aren't surprised during the release process
>
> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
>  wrote:
>>
>> The proof of the pudding is in the eating. You convinced me.
>>
>> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney 
> wrote:
>>
>>> Krisztian -- are you all right with proceeding with merging the CMake
>>> refactor? I'm pretty committed to helping fix the problems that come
>>> up. Since most consumers of the project don't test until _after_ a
>>> release, we won't find out about some problems until we merge it and
>>> release it. Thus, IMHO it doesn't make sense to wait another 8-10
>>> weeks since we'd be delaying feedback for that long. There are also a
>>> number of follow-on issues blocking on the refactor
>>>
>>> On Tue, Mar 12, 2019 at 11:39 AM Andy Grove 
> wrote:

 I've cleaned up my issues for Rust, moving most of them to 0.14.0.

 I have two PRs in progress that I would appreciate reviews on:

 https://github.com/apache/arrow/pull/3671 - [Rust] Table API (a.k.a
 DataFrame)

 https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data
> source
>>> in
 DataFusion

 Once these are merged I have some small follow up PRs for 0.13.0
> that I
>>> can
 get done this week.

 Thanks,

 Andy.


 On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney 
>>> wrote:

> hi folks,
>
> I think we are on track to be able to release toward the end of
> this
> month. My proposed timeline:
>
> * This week (March 11-15): feature/improvement push mostly
> * Next week (March 18-22): shift to bug fixes, stabilization, empty
> backlog of feature/improvement JIRAs
> * Week of March 25: propose release candidate
>
> Does this seem reasonable? This puts us at about 9-10 weeks from
> 0.12.
>
> We need an RM for 0.13, any PMCs want to volunteer?
>
> Take a look at our release page:
>
>
>>>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103091219
>
> Out of the open or in-progress issues, we have:
>
> * C#: 3 issues
> * C++ (all components): 51 issues
> * Java: 3 issues
> * Python: 38 issues
> * Rust (all components): 33 issues
>
> Please help curating the backlogs for each component. There's a
> smattering of issues in other categories. There are also 10 open
> issues with No Component (and 20 resolved issues), those need their
> metadata fixed.
>
> Thanks,
> Wes
>
> On Wed, Feb 27, 2019 at 1:49 PM Wes McKinney 
>>> wrote:
>>

[jira] [Created] (ARROW-5011) [Release] Add support in the source release script for custom hash

2019-03-26 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-5011:
-

 Summary: [Release] Add support in the source release script for 
custom hash
 Key: ARROW-5011
 URL: https://issues.apache.org/jira/browse/ARROW-5011
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Francois Saint-Jacques
 Fix For: 0.13.0


This is a minor feature to help debug said script by overriding the
git-archive hash, instead of using the hash inferred from the release tag.
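
A minimal sketch of the override logic described above, written in Python for
illustration rather than as a patch to the actual release script. The
RELEASE_HASH environment variable and both function names are hypothetical;
only the `git rev-list -n 1 <tag>` call is standard git.

```python
import os
import subprocess


def hash_from_tag(tag):
    """Ask git for the commit hash that `tag` points to."""
    out = subprocess.check_output(["git", "rev-list", "-n", "1", tag])
    return out.decode("ascii").strip()


def archive_hash(tag, env=None, resolve=hash_from_tag):
    """Return the commit to archive: an explicit override wins over the tag."""
    env = os.environ if env is None else env
    # RELEASE_HASH is a hypothetical variable name chosen for this sketch.
    override = env.get("RELEASE_HASH", "").strip()
    if override:
        return override
    # No override given: fall back to the hash inferred from the release tag.
    return resolve(tag)
```

With this shape, a debugging run can point the archive step at any commit
without creating a tag, while a normal release run leaves the variable unset
and behaves as before.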





[jira] [Created] (ARROW-5010) [Release] Fix release script with llvm-7

2019-03-26 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-5010:
-

 Summary: [Release] Fix release script with llvm-7
 Key: ARROW-5010
 URL: https://issues.apache.org/jira/browse/ARROW-5010
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Francois Saint-Jacques
Assignee: Francois Saint-Jacques


The source release script fails to compile Gandiva because Gandiva requires
llvm-7 and only llvm-6 is available in the ubuntu18 docker image.


