[jira] [Created] (ARROW-4968) [Rust] StructArray builder and From<> methods should check that field types match schema

2019-03-19 Thread Neville Dipale (JIRA)
Neville Dipale created ARROW-4968:
-

 Summary: [Rust] StructArray builder and From<> methods should 
check that field types match schema
 Key: ARROW-4968
 URL: https://issues.apache.org/jira/browse/ARROW-4968
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Affects Versions: 0.13.0
Reporter: Neville Dipale


Similar to how we assert that array data types are equal to their field types, 
we should do the same for StructArray and StructBuilder where necessary
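
To illustrate the kind of check being asked for, here is a minimal sketch, written in Python rather than Rust for brevity. The `Field` and `Array` classes and the `build_struct` function are hypothetical stand-ins invented for this example, not the Arrow Rust API; the only point is that each child array's type is compared against its field's declared type before the struct is assembled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical schema field: a name plus a data type label."""
    name: str
    data_type: str


@dataclass(frozen=True)
class Array:
    """Hypothetical array: a data type label plus its values."""
    data_type: str
    values: tuple


def build_struct(pairs: List[Tuple[Field, Array]]) -> List[Tuple[Field, Array]]:
    """Reject any child array whose type differs from its field's declared type."""
    for field, array in pairs:
        if field.data_type != array.data_type:
            raise TypeError(
                f"field {field.name!r} expects {field.data_type}, "
                f"got array of {array.data_type}"
            )
    return list(pairs)
```

A real implementation would presumably perform the same comparison in both the builder and the From<> conversions, so that a mismatch fails early instead of producing an inconsistent struct.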



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Paul Taylor
I agree, the JS has matured a lot in the last few months. I think it's 
ready to join the regular Arrow releases. Let me know if I can help 
integrate the publish scripts :-)


The two main things in progress are docs + Vector Builders, neither of 
which should block this release.


We're going to try to get the docs/recipes ready for a PR this weekend. 
If that lands shortly after 0.13.0 goes out, would it be possible to 
update the website independently, or would that need to wait until 0.14?


Paul

On 3/19/19 10:08 AM, Wes McKinney wrote:

I'm in favor of including JS in the 0.13.0 release.

I'm going to try to fix a couple of the Python Parquet bugs until the
RC is ready to be cut, but none of them need block the release.

Seems like we need someone else to volunteer to be the RM for 0.13 if
Uwe is unavailable next week. Antoine -- are you possibly up for it
(the initial setup will be a bit painful)? I don't have access to a
machine with my code signing key on it until next week so I cannot do
it

- Wes

On Tue, Mar 19, 2019 at 9:46 AM Kouhei Sutou  wrote:

Hi,

There are no blockers on GLib, Ruby and Linux packages.

Can we include JavaScript into 0.13.0?
If we include JavaScript into 0.13.0, we can remove
codes to release JavaScript separately. For example, we can
remove dev/release/js-*. We can enable version update code
in dev/release/00-prepare.sh:
https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74

We can merge "JavaScript Releases" document into our release
document:
https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases


Thanks,
--
kou

In 
   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12 -0500,
   Wes McKinney  wrote:


hi folks,

I think we're basically at the 0.13 end game here. There's some more
patches can get in, but do we all think we can cut an RC by the end of
the week? What are the blocking issues?

Thanks
Wes

On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:

Hi,


Submitted the packaging builds:
https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452

I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
It has been merged.
So .deb/.rpm packages are ready for release.

Thanks,
--
kou

In 
   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 +0100,
   Krisztián Szűcs  wrote:


Submitted the packaging builds:
https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452

On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney  wrote:


The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
this.

We should run all the packaging tasks and get a full accounting of
what is broken so we aren't surprised during the release process

On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
 wrote:

The proof of the pudding is in the eating. You convinced me.

On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney wrote:

Krisztian -- are you all right with proceeding with merging the CMake
refactor? I'm pretty committed to helping fix the problems that come
up. Since most consumers of the project don't test until _after_ a
release, we won't find out about some problems until we merge it and
release it. Thus, IMHO it doesn't make sense to wait another 8-10
weeks since we'd be delaying feedback for that long. There are also a
number of follow-on issues blocking on the refactor

On Tue, Mar 12, 2019 at 11:39 AM Andy Grove wrote:

I've cleaned up my issues for Rust, moving most of them to 0.14.0.

I have two PRs in progress that I would appreciate reviews on:

https://github.com/apache/arrow/pull/3671 - [Rust] Table API (a.k.a
DataFrame)

https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data source in
DataFusion

Once these are merged I have some small follow up PRs for 0.13.0 that I can
get done this week.

Thanks,

Andy.


On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney wrote:

hi folks,

I think we are on track to be able to release toward the end of this
month. My proposed timeline:

* This week (March 11-15): feature/improvement push mostly
* Next week (March 18-22): shift to bug fixes, stabilization, empty
backlog of feature/improvement JIRAs
* Week of March 25: propose release candidate

Does this seem reasonable? This puts us at about 9-10 weeks from 0.12.

We need an RM for 0.13, any PMCs want to volunteer?

Take a look at our release page:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103091219

Out of the open or in-progress issues, we have:

* C#: 3 issues
* C++ (all components): 51 issues
* Java: 3 issues
* Python: 38 issues
* Rust (all components): 33 issues

Please help curating the backlogs for each component. There's a
smattering of issues in other categories. There are also 10 open
issues with No Component (and 20 resolved issues), those need their
metadata fixed.

Thanks,
Wes

On Wed, Feb 27, 2019 at 1:49 PM Wes McKinney wrote:

The timeline for the 0.13 release is 

[jira] [Created] (ARROW-4967) Object type and stats lost when using 96-bit timestamps

2019-03-19 Thread Diego Argueta (JIRA)
Diego Argueta created ARROW-4967:


 Summary: Object type and stats lost when using 96-bit timestamps
 Key: ARROW-4967
 URL: https://issues.apache.org/jira/browse/ARROW-4967
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.12.1
 Environment: PyArrow: 0.12.1
Python: 2.7.15, 3.7.2
Pandas: 0.24.2
Reporter: Diego Argueta


Run the following code:

{code:python}
import datetime as dt
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

dataframe = pd.DataFrame({'foo': [dt.datetime.now()]})
table = pa.Table.from_pandas(dataframe, preserve_index=False)

pq.write_table(table, 'int64.parq')
pq.write_table(table, 'int96.parq', use_deprecated_int96_timestamps=True)
{code}

Examining the {{int64.parq}} file, we see that the column metadata includes an 
object type of {{TIMESTAMP_MICROS}} and also gives some stats. All is well.

{code}
file schema: schema 

foo: OPTIONAL INT64 O:TIMESTAMP_MICROS R:0 D:1

row group 1: RC:1 TS:76 OFFSET:4 

foo:  INT64 SNAPPY ... ST:[min: 2019-12-31T23:59:59.999000, max: 
2019-12-31T23:59:59.999000, num_nulls: 0]
{code}


However, if we look at {{int96.parq}}, it appears that that metadata is lost. 
No object type, and no column stats.

{code}
file schema: schema 

foo: OPTIONAL INT96 R:0 D:1

row group 1: RC:1 TS:58 OFFSET:4 

foo:  INT96 SNAPPY ... ST:[no stats for this column]
{code}

This is a bit confusing since the metadata for the exact same data can look 
different depending on whether an unrelated flag is set or cleared.





Re: Any formal plans for mxnet/xgboost and arrow collaboration?

2019-03-19 Thread Wes McKinney
hi Jonathan,

Well, one of the good things about Apache projects is that working in
private is frowned upon, so if you don't find anything in JIRA or in
the mailing list archives then the answer is most probably "no" :)

That being said, I'm interested in integrations between Apache Arrow
and machine learning frameworks. Generally some serialization is
necessary at some point because machine learning frameworks operate
in many cases on (homogeneously-typed) multidimensional arrays or
sparse matrices.

- Wes

On Tue, Mar 19, 2019 at 1:42 PM Jonathan Chiang  wrote:
>
> Hi,
>
> I was curious if there was any concrete roadmap to build compatibility 
> between these Apache frameworks.
>
> Would love to do data processing in python/R in arrow for deep learning or 
> tabular data model building.
>
> My use case is model building and interoperability between R and Python users 
> at Stanford Biomedical Informatics.
>
> My dream would be to have cloud-based/local hybrid deep learning and reduce 
> the amount of serialization needed between R and Python users.
>
> Thanks,
> Jonathan


Any formal plans for mxnet/xgboost and arrow collaboration?

2019-03-19 Thread Jonathan Chiang
Hi,

I was curious if there was any concrete roadmap to build compatibility between 
these Apache frameworks. 

Would love to do data processing in python/R in arrow for deep learning or 
tabular data model building.

My use case is model building and interoperability between R and Python users 
at Stanford Biomedical Informatics. 

My dream would be to have cloud-based/local hybrid deep learning and reduce 
the amount of serialization needed between R and Python users. 

Thanks,
Jonathan 

Re: Memory mapped files in Java

2019-03-19 Thread Wes McKinney
hi Razvan,

I think this is dependent on
https://issues.apache.org/jira/browse/ARROW-3191. Once that is
accomplished I think that bridging memory mapped files and ArrowBuf is
relatively straightforward.

- Wes

On Tue, Mar 19, 2019 at 12:13 PM Razvan Chitu  wrote:
>
> Hi,
>
> I was looking for a way to interact with memory mapped Arrow files in Java
> and I found this thread:
> http://mail-archives.apache.org/mod_mbox/arrow-dev/201709.mbox/%3CCAOgX8szfO-F=ccsqcggucqfzqkgu2wy+pihztbv1gkat4eq...@mail.gmail.com%3E
> . Are there any updates on the status of an implementation (or a plan /
> design)?
>
> Best,
> Razvan


[jira] [Created] (ARROW-4966) orc::TimezoneError Can't open /usr/share/zoneinfo/GMT-00:00

2019-03-19 Thread Peter Wicks (JIRA)
Peter Wicks created ARROW-4966:
--

 Summary: orc::TimezoneError Can't open /usr/share/zoneinfo/GMT-00:00
 Key: ARROW-4966
 URL: https://issues.apache.org/jira/browse/ARROW-4966
 Project: Apache Arrow
  Issue Type: Bug
  Components: cpp
Affects Versions: 0.12.0
Reporter: Peter Wicks


When reading some ORC files, pyarrow orc throws the following error on 
`read()`: 

`o = pf.read()`
terminate called after throwing an instance of 'orc::TimezoneError'
 what(): Can't open /usr/share/zoneinfo/GMT-00:00

While it's true this folder does not exist, I don't think it normally does. Our 
server has folders for `GMT`, `GMT0`, `GMT-0`, and `GMT+0`.

ORC file was created using HIVE, compressed with Snappy. Other files from the 
same table/partition do not throw this error. Files can be read with Hive.
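
A quick way to confirm which zone entries exist on a given machine is sketched below. This is only a diagnostic helper using the Python standard library; the function name `zone_files_present` and the candidate list are made up for this example, and the default root is simply the path taken from the error message above.

```python
import os
from typing import Dict, Iterable


def zone_files_present(
    names: Iterable[str], root: str = "/usr/share/zoneinfo"
) -> Dict[str, bool]:
    """Map each time zone name to whether a matching entry exists under root."""
    return {name: os.path.exists(os.path.join(root, name)) for name in names}


if __name__ == "__main__":
    # The first four names are the ones reported as present on our server;
    # the last is the name the ORC reader tried to open.
    candidates = ["GMT", "GMT0", "GMT-0", "GMT+0", "GMT-00:00"]
    for name, present in zone_files_present(candidates).items():
        print(f"{name}: {'present' if present else 'missing'}")
```

Running this on the affected host shows at a glance whether the name requested by the reader has a zoneinfo entry, without having to reproduce the crash.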





Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Wes McKinney
I'm in favor of including JS in the 0.13.0 release.

I'm going to try to fix a couple of the Python Parquet bugs until the
RC is ready to be cut, but none of them need block the release.

Seems like we need someone else to volunteer to be the RM for 0.13 if
Uwe is unavailable next week. Antoine -- are you possibly up for it
(the initial setup will be a bit painful)? I don't have access to a
machine with my code signing key on it until next week so I cannot do
it

- Wes

On Tue, Mar 19, 2019 at 9:46 AM Kouhei Sutou  wrote:
>
> Hi,
>
> There are no blockers on GLib, Ruby and Linux packages.
>
> Can we include JavaScript into 0.13.0?
> If we include JavaScript into 0.13.0, we can remove
> codes to release JavaScript separately. For example, we can
> remove dev/release/js-*. We can enable version update code
> in dev/release/00-prepare.sh:
> https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74
>
> We can merge "JavaScript Releases" document into our release
> document:
> https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases
>
>
> Thanks,
> --
> kou
>
> In 
>   "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12 -0500,
>   Wes McKinney  wrote:
>
> > hi folks,
> >
> > I think we're basically at the 0.13 end game here. There's some more
> > patches can get in, but do we all think we can cut an RC by the end of
> > the week? What are the blocking issues?
> >
> > Thanks
> > Wes
> >
> > On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:
> >>
> >> Hi,
> >>
> >> > Submitted the packaging builds:
> >> > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
> >>
> >> I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
> >> It has been merged.
> >> So .deb/.rpm packages are ready for release.
> >>
> >> Thanks,
> >> --
> >> kou
> >>
> >> In 
> >>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 +0100,
> >>   Krisztián Szűcs  wrote:
> >>
> >> > Submitted the packaging builds:
> >> > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
> >> >
> >> > On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney  wrote:
> >> >
> >> >> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
> >> >> this.
> >> >>
> >> >> We should run all the packaging tasks and get a full accounting of
> >> >> what is broken so we aren't surprised during the release process
> >> >>
> >> >> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
> >> >>  wrote:
> >> >> >
> >> >> > The proof of the pudding is in the eating. You convinced me.
> >> >> >
> >> >> > On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney 
> >> >> wrote:
> >> >> >
> >> >> > > Krisztian -- are you all right with proceeding with merging the 
> >> >> > > CMake
> >> >> > > refactor? I'm pretty committed to helping fix the problems that come
> >> >> > > up. Since most consumers of the project don't test until _after_ a
> >> >> > > release, we won't find out about some problems until we merge it and
> >> >> > > release it. Thus, IMHO it doesn't make sense to wait another 8-10
> >> >> > > weeks since we'd be delaying feedback for that long. There are also 
> >> >> > > a
> >> >> > > number of follow-on issues blocking on the refactor
> >> >> > >
> >> >> > > On Tue, Mar 12, 2019 at 11:39 AM Andy Grove 
> >> >> wrote:
> >> >> > > >
> >> >> > > > I've cleaned up my issues for Rust, moving most of them to 0.14.0.
> >> >> > > >
> >> >> > > > I have two PRs in progress that I would appreciate reviews on:
> >> >> > > >
> >> >> > > > https://github.com/apache/arrow/pull/3671 - [Rust] Table API 
> >> >> > > > (a.k.a
> >> >> > > > DataFrame)
> >> >> > > >
> >> >> > > > https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data
> >> >> source
> >> >> > > in
> >> >> > > > DataFusion
> >> >> > > >
> >> >> > > > Once these are merged I have some small follow up PRs for 0.13.0
> >> >> that I
> >> >> > > can
> >> >> > > > get done this week.
> >> >> > > >
> >> >> > > > Thanks,
> >> >> > > >
> >> >> > > > Andy.
> >> >> > > >
> >> >> > > >
> >> >> > > > On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney 
> >> >> > > wrote:
> >> >> > > >
> >> >> > > > > hi folks,
> >> >> > > > >
> >> >> > > > > I think we are on track to be able to release toward the end of
> >> >> this
> >> >> > > > > month. My proposed timeline:
> >> >> > > > >
> >> >> > > > > * This week (March 11-15): feature/improvement push mostly
> >> >> > > > > * Next week (March 18-22): shift to bug fixes, stabilization, 
> >> >> > > > > empty
> >> >> > > > > backlog of feature/improvement JIRAs
> >> >> > > > > * Week of March 25: propose release candidate
> >> >> > > > >
> >> >> > > > > Does this seem reasonable? This puts us at about 9-10 weeks from
> >> >> 0.12.
> >> >> > > > >
> >> >> > > > > We need an RM for 0.13, any PMCs want to volunteer?
> >> >> > > > >
> >> >> > > > > Take a look at our release page:
> >> >> > > > >
> >> >> > > > >
> >> >> > >
> >> >> 

Memory mapped files in Java

2019-03-19 Thread Razvan Chitu
Hi,

I was looking for a way to interact with memory mapped Arrow files in Java
and I found this thread:
http://mail-archives.apache.org/mod_mbox/arrow-dev/201709.mbox/%3CCAOgX8szfO-F=ccsqcggucqfzqkgu2wy+pihztbv1gkat4eq...@mail.gmail.com%3E
. Are there any updates on the status of an implementation (or a plan /
design)?

Best,
Razvan


[jira] [Created] (ARROW-4965) [Python] Timestamp array type detection should use tzname of datetime.datetime objects

2019-03-19 Thread Tim Swast (JIRA)
Tim Swast created ARROW-4965:


 Summary: [Python] Timestamp array type detection should use tzname 
of datetime.datetime objects
 Key: ARROW-4965
 URL: https://issues.apache.org/jira/browse/ARROW-4965
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
 Environment: $ python --version
Python 3.7.2

$ pip freeze
numpy==1.16.2
pyarrow==0.12.1
pytz==2018.9
six==1.12.0

$ sw_vers
ProductName:Mac OS X
ProductVersion: 10.14.3
BuildVersion:   18D109
(pyarrow) 
Reporter: Tim Swast


The type detection from datetime objects to array appears to ignore the 
presence of a tzinfo on the datetime object, instead storing them as naive 
timestamp columns.

Python code:

{code:python}
import datetime
import pytz
import pyarrow as pa

naive_datetime = datetime.datetime(2019, 1, 13, 12, 11, 10)
utc_datetime = datetime.datetime(2019, 1, 13, 12, 11, 10, tzinfo=pytz.utc)
tzaware_datetime = utc_datetime.astimezone(pytz.timezone('America/Los_Angeles'))

def inspect(varname):
    print(varname)
    arr = globals()[varname]
    print(arr.type)
    print(arr)
    print()

auto_naive_arr = pa.array([naive_datetime])
inspect("auto_naive_arr")

auto_utc_arr = pa.array([utc_datetime])
inspect("auto_utc_arr")

auto_tzaware_arr = pa.array([tzaware_datetime])
inspect("auto_tzaware_arr")

auto_mixed_arr = pa.array([utc_datetime, tzaware_datetime])
inspect("auto_mixed_arr")

naive_type = pa.timestamp("us", naive_datetime.tzname())
utc_type = pa.timestamp("us", utc_datetime.tzname())
tzaware_type = pa.timestamp("us", tzaware_datetime.tzname())

naive_arr = pa.array([naive_datetime], type=naive_type)
inspect("naive_arr")

utc_arr = pa.array([utc_datetime], type=utc_type)
inspect("utc_arr")

tzaware_arr = pa.array([tzaware_datetime], type=tzaware_type)
inspect("tzaware_arr")

mixed_arr = pa.array([utc_datetime, tzaware_datetime], type=utc_type)
inspect("mixed_arr")
{code}

This prints:

{noformat}
$ python detect_timezone.py
auto_naive_arr
timestamp[us]
[
  1547381470000000
]

auto_utc_arr
timestamp[us]
[
  1547381470000000
]

auto_tzaware_arr
timestamp[us]
[
  1547352670000000
]

auto_mixed_arr
timestamp[us]
[
  1547381470000000,
  1547352670000000
]

naive_arr
timestamp[us]
[
  1547381470000000
]

utc_arr
timestamp[us, tz=UTC]
[
  1547381470000000
]

tzaware_arr
timestamp[us, tz=PST]
[
  1547352670000000
]

mixed_arr
timestamp[us, tz=UTC]
[
  1547381470000000,
  1547352670000000
]
{noformat}

But I would expect the following types instead:

* {{auto_naive_arr}}: {{timestamp[us]}}
* {{auto_utc_arr}}: {{timestamp[us, tz=UTC]}}
* {{auto_tzaware_arr}}: {{timestamp[us, tz=PST]}} (Or maybe 
{{tz='America/Los_Angeles'}}. I'm not sure why {{pytz}} returns {{PST}} as the 
{{tzname}})
* {{auto_mixed_arr}}: {{timestamp[us, tz=UTC]}}

Also, in the "mixed" case, I'd expect the actual stored microseconds to be the 
same for both rows, since {{utc_datetime}} and {{tzaware_datetime}} both refer 
to the same point in time. It seems reasonable for any naive datetime objects 
mixed in with tz-aware datetimes to be interpreted as UTC.
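
To make the "same point in time" expectation concrete, here is a small standard-library sketch (no pyarrow involved). The helper name `to_epoch_micros` is made up for this example, and a fixed UTC-8 offset is assumed as a stand-in for {{America/Los_Angeles}} in January so that {{pytz}} is not required. Both datetimes convert to the same number of microseconds since the epoch, which is what I would expect to see stored for both rows.

```python
import datetime as dt

EPOCH = dt.datetime(1970, 1, 1, tzinfo=dt.timezone.utc)


def to_epoch_micros(value: dt.datetime) -> int:
    """Microseconds since the Unix epoch for a tz-aware datetime."""
    if value.tzinfo is None:
        raise ValueError("expected a tz-aware datetime")
    return (value - EPOCH) // dt.timedelta(microseconds=1)


# Assumption: a fixed UTC-8 offset standing in for America/Los_Angeles in January.
pst = dt.timezone(dt.timedelta(hours=-8), "PST")

utc_datetime = dt.datetime(2019, 1, 13, 12, 11, 10, tzinfo=dt.timezone.utc)
tzaware_datetime = utc_datetime.astimezone(pst)

# The wall-clock fields differ (12:11:10 vs 04:11:10), but the instant is the same,
# so both calls return the same integer.
print(to_epoch_micros(utc_datetime))
print(to_epoch_micros(tzaware_datetime))
print(tzaware_datetime.tzname())
```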





Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Kouhei Sutou
Hi,

There are no blockers on GLib, Ruby and Linux packages.

Can we include JavaScript into 0.13.0?
If we include JavaScript into 0.13.0, we can remove
codes to release JavaScript separately. For example, we can
remove dev/release/js-*. We can enable version update code
in dev/release/00-prepare.sh:
https://github.com/apache/arrow/blob/master/dev/release/00-prepare.sh#L67-L74

We can merge "JavaScript Releases" document into our release
document:
https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide#ReleaseManagementGuide-JavaScriptReleases


Thanks,
--
kou

In 
  "Re: Timeline for 0.13 Arrow release" on Mon, 18 Mar 2019 20:51:12 -0500,
  Wes McKinney  wrote:

> hi folks,
> 
> I think we're basically at the 0.13 end game here. There's some more
> patches can get in, but do we all think we can cut an RC by the end of
> the week? What are the blocking issues?
> 
> Thanks
> Wes
> 
> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:
>>
>> Hi,
>>
>> > Submitted the packaging builds:
>> > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
>>
>> I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
>> It has been merged.
>> So .deb/.rpm packages are ready for release.
>>
>> Thanks,
>> --
>> kou
>>
>> In 
>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 +0100,
>>   Krisztián Szűcs  wrote:
>>
>> > Submitted the packaging builds:
>> > https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
>> >
>> > On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney  wrote:
>> >
>> >> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
>> >> this.
>> >>
>> >> We should run all the packaging tasks and get a full accounting of
>> >> what is broken so we aren't surprised during the release process
>> >>
>> >> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
>> >>  wrote:
>> >> >
>> >> > The proof of the pudding is in the eating. You convinced me.
>> >> >
>> >> > On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney 
>> >> wrote:
>> >> >
>> >> > > Krisztian -- are you all right with proceeding with merging the CMake
>> >> > > refactor? I'm pretty committed to helping fix the problems that come
>> >> > > up. Since most consumers of the project don't test until _after_ a
>> >> > > release, we won't find out about some problems until we merge it and
>> >> > > release it. Thus, IMHO it doesn't make sense to wait another 8-10
>> >> > > weeks since we'd be delaying feedback for that long. There are also a
>> >> > > number of follow-on issues blocking on the refactor
>> >> > >
>> >> > > On Tue, Mar 12, 2019 at 11:39 AM Andy Grove 
>> >> wrote:
>> >> > > >
>> >> > > > I've cleaned up my issues for Rust, moving most of them to 0.14.0.
>> >> > > >
>> >> > > > I have two PRs in progress that I would appreciate reviews on:
>> >> > > >
>> >> > > > https://github.com/apache/arrow/pull/3671 - [Rust] Table API (a.k.a
>> >> > > > DataFrame)
>> >> > > >
>> >> > > > https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data
>> >> source
>> >> > > in
>> >> > > > DataFusion
>> >> > > >
>> >> > > > Once these are merged I have some small follow up PRs for 0.13.0
>> >> that I
>> >> > > can
>> >> > > > get done this week.
>> >> > > >
>> >> > > > Thanks,
>> >> > > >
>> >> > > > Andy.
>> >> > > >
>> >> > > >
>> >> > > > On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney 
>> >> > > wrote:
>> >> > > >
>> >> > > > > hi folks,
>> >> > > > >
>> >> > > > > I think we are on track to be able to release toward the end of
>> >> this
>> >> > > > > month. My proposed timeline:
>> >> > > > >
>> >> > > > > * This week (March 11-15): feature/improvement push mostly
>> >> > > > > * Next week (March 18-22): shift to bug fixes, stabilization, 
>> >> > > > > empty
>> >> > > > > backlog of feature/improvement JIRAs
>> >> > > > > * Week of March 25: propose release candidate
>> >> > > > >
>> >> > > > > Does this seem reasonable? This puts us at about 9-10 weeks from
>> >> 0.12.
>> >> > > > >
>> >> > > > > We need an RM for 0.13, any PMCs want to volunteer?
>> >> > > > >
>> >> > > > > Take a look at our release page:
>> >> > > > >
>> >> > > > >
>> >> > >
>> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103091219
>> >> > > > >
>> >> > > > > Out of the open or in-progress issues, we have:
>> >> > > > >
>> >> > > > > * C#: 3 issues
>> >> > > > > * C++ (all components): 51 issues
>> >> > > > > * Java: 3 issues
>> >> > > > > * Python: 38 issues
>> >> > > > > * Rust (all components): 33 issues
>> >> > > > >
>> >> > > > > Please help curating the backlogs for each component. There's a
>> >> > > > > smattering of issues in other categories. There are also 10 open
>> >> > > > > issues with No Component (and 20 resolved issues), those need 
>> >> > > > > their
>> >> > > > > metadata fixed.
>> >> > > > >
>> >> > > > > Thanks,
>> >> > > > > Wes
>> >> > > > >
>> >> > > > > On Wed, Feb 27, 2019 at 1:49 PM Wes McKinney 
>> >> > > wrote:
>> >> > > > > >
>> >> > > > > 

[jira] [Created] (ARROW-4964) [Ruby] Add closed check if available on auto close

2019-03-19 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-4964:
---

 Summary: [Ruby] Add closed check if available on auto close
 Key: ARROW-4964
 URL: https://issues.apache.org/jira/browse/ARROW-4964
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Ruby
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Uwe L. Korn
Hello,

https://issues.apache.org/jira/browse/ARROW-3578 is something that needs to be 
fixed upstream in RAT. The only thing for Arrow is then to update the RAT 
version.

With the RC date at the end of the week, I sadly cannot be release manager as 
I'm not sure about my availability from next week on.

Uwe

On Tue, Mar 19, 2019, at 11:59 AM, Antoine Pitrou wrote:
> 
> I'd suggest to concentrate on bug fixes instead (especially as there
> seems to be a Rust blocker).  New features can wait for 0.14.
> 
> Regards
> 
> Antoine.
> 
> 
> Le 19/03/2019 à 11:50, Neville Dipale a écrit :
> > When is the cut-off for PRs? We have a public holiday on Thursday, and I
> > want to use that to finish off my work on array casting.
> > 
> > If that'll be too late I can defer to the next release.
> > 
> > On Tue, 19 Mar 2019, 12:34 Antoine Pitrou,  wrote:
> > 
> >>
> >> The only potential blocker from my POV is
> >> https://issues.apache.org/jira/browse/ARROW-3578, but we've already
> >> lived with it for previous releases, so perhaps it's ok anyway?
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >> Le 19/03/2019 à 02:51, Wes McKinney a écrit :
> >>> hi folks,
> >>>
> >>> I think we're basically at the 0.13 end game here. There's some more
> >>> patches can get in, but do we all think we can cut an RC by the end of
> >>> the week? What are the blocking issues?
> >>>
> >>> Thanks
> >>> Wes
> >>>
> >>> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:
> 
>  Hi,
> 
> > Submitted the packaging builds:
> >
> >> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
> 
>  I've fixed .deb/.rpm packages:
> >> https://github.com/apache/arrow/pull/3934
>  It has been merged.
>  So .deb/.rpm packages are ready for release.
> 
>  Thanks,
>  --
>  kou
> 
>  In 
>    "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43
> >> +0100,
>    Krisztián Szűcs  wrote:
> 
> > Submitted the packaging builds:
> >
> >> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
> >
> > On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney 
> >> wrote:
> >
> >> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor
> >> on
> >> this.
> >>
> >> We should run all the packaging tasks and get a full accounting of
> >> what is broken so we aren't surprised during the release process
> >>
> >> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs
> >>  wrote:
> >>>
> >>> The proof of the pudding is in the eating. You convinced me.
> >>>
> >>> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney 
> >> wrote:
> >>>
>  Krisztian -- are you all right with proceeding with merging the
> >> CMake
>  refactor? I'm pretty committed to helping fix the problems that come
>  up. Since most consumers of the project don't test until _after_ a
>  release, we won't find out about some problems until we merge it and
>  release it. Thus, IMHO it doesn't make sense to wait another 8-10
>  weeks since we'd be delaying feedback for that long. There are also
> >> a
>  number of follow-on issues blocking on the refactor
> 
>  On Tue, Mar 12, 2019 at 11:39 AM Andy Grove 
> >> wrote:
> >
> > I've cleaned up my issues for Rust, moving most of them to 0.14.0.
> >
> > I have two PRs in progress that I would appreciate reviews on:
> >
> > https://github.com/apache/arrow/pull/3671 - [Rust] Table API
> >> (a.k.a
> > DataFrame)
> >
> > https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data
> >> source
>  in
> > DataFusion
> >
> > Once these are merged I have some small follow up PRs for 0.13.0
> >> that I
>  can
> > get done this week.
> >
> > Thanks,
> >
> > Andy.
> >
> >
> > On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney 
>  wrote:
> >
> >> hi folks,
> >>
> >> I think we are on track to be able to release toward the end of
> >> this
> >> month. My proposed timeline:
> >>
> >> * This week (March 11-15): feature/improvement push mostly
> >> * Next week (March 18-22): shift to bug fixes, stabilization,
> >> empty
> >> backlog of feature/improvement JIRAs
> >> * Week of March 25: propose release candidate
> >>
> >> Does this seem reasonable? This puts us at about 9-10 weeks from
> >> 0.12.
> >>
> >> We need an RM for 0.13, any PMCs want to volunteer?
> >>
> >> Take a look at our release page:
> >>
> >>
> 
> >>
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103091219
> >>
> >> Out of the open or in-progress issues, we have:
> 

[jira] [Created] (ARROW-4963) [C++] MSVC build invokes CMake repeatedly

2019-03-19 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4963:
---

 Summary: [C++] MSVC build invokes CMake repeatedly
 Key: ARROW-4963
 URL: https://issues.apache.org/jira/browse/ARROW-4963
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.14.0


I'm doing a pretty vanilla out of source build with Visual Studio 2015 and I am 
finding that it's re-running CMake many times throughout the build. I will try 
to produce a complete log when I can to illustrate. I am using this command:

{code}
cmake -G "Visual Studio 14 2015 Win64" ^
  -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^
  -DARROW_CXXFLAGS="/WX /MP" ^
  -DARROW_GANDIVA=on ^
  -DARROW_ORC=on ^
  -DARROW_PARQUET=on ^
  -DARROW_PYTHON=on ..
cmake --build . --target INSTALL --config Release
{code}





[jira] [Created] (ARROW-4962) [C++] Warning level to CHECKIN can't compile on modern GCC

2019-03-19 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-4962:
-

 Summary: [C++] Warning level to CHECKIN can't compile on modern GCC
 Key: ARROW-4962
 URL: https://issues.apache.org/jira/browse/ARROW-4962
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.12.1
Reporter: Francois Saint-Jacques
Assignee: Francois Saint-Jacques
 Fix For: 0.13.0


This is somewhat related to the recent DCHECK change.





[jira] [Created] (ARROW-4961) [C++][Python] Add GTest_SOURCE=BUNDLED to relevant build docs that use toolchain

2019-03-19 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4961:
---

 Summary: [C++][Python] Add GTest_SOURCE=BUNDLED to relevant build 
docs that use toolchain
 Key: ARROW-4961
 URL: https://issues.apache.org/jira/browse/ARROW-4961
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Wes McKinney
Assignee: Wes McKinney
 Fix For: 0.13.0


The conda-forge gtest packages don't work for me on Windows. In the meantime, 
it is necessary to use the BUNDLED method, as we already do in our 
Appveyor builds, so we should update the documentation so that others don't hit 
this rough edge:

{code}
util-internal-test.obj : error LNK2001: unresolved external symbol "class testing::internal::Mutex testing::internal::g_gmock_mutex" (?g_gmock_mutex@internal@testing@@3VMutex@12@A)
[C:\Users\wesmc\code\arrow\cpp\build\src\arrow\compute\kernels\arrow-compute-util-internal-test.vcxproj]
util-internal-test.obj : error LNK2001: unresolved external symbol "class testing::internal::ThreadLocal testing::internal::g_gmock_implicit_sequence" (?g_gmock_implicit_sequence@internal@testing@@3V?$ThreadLocal@PEAVSequence@testing@@@12@A)
[C:\Users\wesmc\code\arrow\cpp\build\src\arrow\compute\kernels\arrow-compute-util-internal-test.vcxproj]
C:\Users\wesmc\code\arrow\cpp\build\release\Release\arrow-compute-util-internal-test.exe : fatal error LNK1120: 2 unresolved externals
[C:\Users\wesmc\code\arrow\cpp\build\src\arrow\compute\kernels\arrow-compute-util-internal-test.vcxproj]
{code}





Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Antoine Pitrou


I'd suggest concentrating on bug fixes instead (especially as there
seems to be a Rust blocker).  New features can wait for 0.14.

Regards

Antoine.


On 19/03/2019 at 11:50, Neville Dipale wrote:
> When is the cut-off for PRs? We have a public holiday on Thursday, and I
> want to use that to finish off my work on array casting.
> 
> If that'll be too late I can defer to the next release.
> 
> On Tue, 19 Mar 2019, 12:34 Antoine Pitrou,  wrote:
> 
>>
>> The only potential blocker from my POV is
>> https://issues.apache.org/jira/browse/ARROW-3578, but we've already
>> lived with it for previous releases, so perhaps it's ok anyway?
>>
>> Regards
>>
>> Antoine.
>>
>>
>> On 19/03/2019 at 02:51, Wes McKinney wrote:
>>> hi folks,
>>>
>>> I think we're basically at the 0.13 end game here. There are some more
>>> patches that can get in, but do we all think we can cut an RC by the end of
>>> the week? What are the blocking issues?
>>>
>>> Thanks
>>> Wes
>>>

Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Neville Dipale
When is the cut-off for PRs? We have a public holiday on Thursday, and I
want to use that to finish off my work on array casting.

If that'll be too late I can defer to the next release.

On Tue, 19 Mar 2019, 12:34 Antoine Pitrou,  wrote:

>
> The only potential blocker from my POV is
> https://issues.apache.org/jira/browse/ARROW-3578, but we've already
> lived with it for previous releases, so perhaps it's ok anyway?
>
> Regards
>
> Antoine.
>
>
> On 19/03/2019 at 02:51, Wes McKinney wrote:
> > hi folks,
> >
> > I think we're basically at the 0.13 end game here. There are some more
> > patches that can get in, but do we all think we can cut an RC by the end of
> > the week? What are the blocking issues?
> >
> > Thanks
> > Wes
> >

Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Antoine Pitrou


The only potential blocker from my POV is
https://issues.apache.org/jira/browse/ARROW-3578, but we've already
lived with it for previous releases, so perhaps it's ok anyway?

Regards

Antoine.


On 19/03/2019 at 02:51, Wes McKinney wrote:
> hi folks,
> 
> I think we're basically at the 0.13 end game here. There are some more
> patches that can get in, but do we all think we can cut an RC by the end of
> the week? What are the blocking issues?
> 
> Thanks
> Wes
> 
> On Sat, Mar 16, 2019 at 9:57 PM Kouhei Sutou  wrote:
>>
>> Hi,
>>
>>> Submitted the packaging builds:
>>> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
>>
>> I've fixed .deb/.rpm packages: https://github.com/apache/arrow/pull/3934
>> It has been merged.
>> So .deb/.rpm packages are ready for release.
>>
>> Thanks,
>> --
>> kou
>>
>> In 
>>   "Re: Timeline for 0.13 Arrow release" on Thu, 14 Mar 2019 16:24:43 +0100,
>>   Krisztián Szűcs  wrote:
>>
>>> Submitted the packaging builds:
>>> https://github.com/kszucs/crossbow/branches/all?utf8=%E2%9C%93=build-452
>>>
>>> On Thu, Mar 14, 2019 at 4:19 PM Wes McKinney  wrote:
>>>
>>>> The CMake refactor is merged! Kudos to Uwe for 3+ weeks of hard labor on
>>>> this.
>>>>
>>>> We should run all the packaging tasks and get a full accounting of
>>>> what is broken so we aren't surprised during the release process
>>>>
>>>> On Wed, Mar 13, 2019 at 9:39 AM Krisztián Szűcs wrote:
>>>>>
>>>>> The proof of the pudding is in the eating. You convinced me.
>>>>>
>>>>> On Wed, Mar 13, 2019 at 3:31 PM Wes McKinney wrote:
>>>>>
>>>>>> Krisztian -- are you all right with proceeding with merging the CMake
>>>>>> refactor? I'm pretty committed to helping fix the problems that come
>>>>>> up. Since most consumers of the project don't test until _after_ a
>>>>>> release, we won't find out about some problems until we merge it and
>>>>>> release it. Thus, IMHO it doesn't make sense to wait another 8-10
>>>>>> weeks since we'd be delaying feedback for that long. There are also a
>>>>>> number of follow-on issues blocking on the refactor
>>>>>>
>>>>>> On Tue, Mar 12, 2019 at 11:39 AM Andy Grove wrote:
>>>>>>>
>>>>>>> I've cleaned up my issues for Rust, moving most of them to 0.14.0.
>>>>>>>
>>>>>>> I have two PRs in progress that I would appreciate reviews on:
>>>>>>>
>>>>>>> https://github.com/apache/arrow/pull/3671 - [Rust] Table API (a.k.a
>>>>>>> DataFrame)
>>>>>>>
>>>>>>> https://github.com/apache/arrow/pull/3851 - [Rust] Parquet data source in
>>>>>>> DataFusion
>>>>>>>
>>>>>>> Once these are merged I have some small follow up PRs for 0.13.0 that I can
>>>>>>> get done this week.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Andy.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 12, 2019 at 8:21 AM Wes McKinney wrote:
>>>>>>>
>>>>>>>> hi folks,
>>>>>>>>
>>>>>>>> I think we are on track to be able to release toward the end of this
>>>>>>>> month. My proposed timeline:
>>>>>>>>
>>>>>>>> * This week (March 11-15): feature/improvement push mostly
>>>>>>>> * Next week (March 18-22): shift to bug fixes, stabilization, empty
>>>>>>>> backlog of feature/improvement JIRAs
>>>>>>>> * Week of March 25: propose release candidate
>>>>>>>>
>>>>>>>> Does this seem reasonable? This puts us at about 9-10 weeks from 0.12.
>>>>>>>>
>>>>>>>> We need an RM for 0.13, any PMCs want to volunteer?
>>>>>>>>
>>>>>>>> Take a look at our release page:
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103091219
>>>>>>>>
>>>>>>>> Out of the open or in-progress issues, we have:
>>>>>>>>
>>>>>>>> * C#: 3 issues
>>>>>>>> * C++ (all components): 51 issues
>>>>>>>> * Java: 3 issues
>>>>>>>> * Python: 38 issues
>>>>>>>> * Rust (all components): 33 issues
>>>>>>>>
>>>>>>>> Please help curating the backlogs for each component. There's a
>>>>>>>> smattering of issues in other categories. There are also 10 open
>>>>>>>> issues with No Component (and 20 resolved issues), those need their
>>>>>>>> metadata fixed.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Wes
>>>>>>>>
>>>>>>>> On Wed, Feb 27, 2019 at 1:49 PM Wes McKinney wrote:
>>>>>>>>>
>>>>>>>>> The timeline for the 0.13 release is drawing closer. I would say we
>>>>>>>>> should consider a release candidate either the week of March 18 or
>>>>>>>>> March 25, which gives us ~3 weeks to close out backlog items.
>>>>>>>>>
>>>>>>>>> There are around 220 issues open or in-progress in
>>>>>>>>>
>>>>>>>>> https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.13.0+Release
>>>>>>>>>
>>>>>>>>> Please have a look. If issues are not assigned to someone as the next
>>>>>>>>> couple of weeks pass by I'll begin moving at least C++ and Python
>>>>>>>>> issues to 0.14 that don't seem like they're going to get done for
>>>>>>>>> 0.13. If development stakeholders for C#, Java, Rust, Ruby, and other
>>>>>>>>> components can review and curate the issues that would be helpful.
>>>>>>>>>

[jira] [Created] (ARROW-4960) [R] Add crossbow task for r-arrow-feedstock

2019-03-19 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-4960:
--

 Summary: [R] Add crossbow task for r-arrow-feedstock
 Key: ARROW-4960
 URL: https://issues.apache.org/jira/browse/ARROW-4960
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, R
Reporter: Uwe L. Korn
 Fix For: 0.14.0


We also have an R package on conda-forge now: 
[https://github.com/conda-forge/r-arrow-feedstock] This should be tested using 
crossbow as we do with the other packages.





Re: Timeline for 0.13 Arrow release

2019-03-19 Thread Neville Dipale
Thanks Chao, I've provided details to reproduce in the array.rs unit tests

On Tue, 19 Mar 2019 at 06:24, Chao Sun  wrote:

> Neville, I think we should be able to fix the two bugs you mentioned within
> this week. I'll take a look. It would be great if you can provide more
> details in the JIRAs (e.g., test case to reproduce). Array currently
> doesn't expose a bitmask API, and I don't think we need specialized
> implementations for struct & list.
>
> Chao
>
> On Mon, Mar 18, 2019 at 8:24 PM Neville Dipale 
> wrote:
>
> > Hi Wes,
> >
> > In Rust, we have 2 bugs (
> https://issues.apache.org/jira/browse/ARROW-4914,
> > https://issues.apache.org/jira/browse/ARROW-4886) both related to array
> > slicing.
> >
> > In summary:
> >
> > * ARROW-4914: the bitmask of the original array is used to determine the
> > validity of the sliced array, but offsets aren't read correctly. An array
> > with bitmask 10111 sliced with (offset=2, len=3) will return a bitmask of
> > 101 instead of 111
> > * ARROW-4886, we implemented slice on the Array interface, but don't have
> > specialised implementations for struct and list, so we leak the
> > implementation.
> >
> > I think if we can't get to both by the time we release an RC, the best
> > solution would be to revert
> > https://issues.apache.org/jira/browse/ARROW-3954
> > .
> >
> > Any thoughts from Rust committers?
> >
> > Neville
> >
> > On Tue, 19 Mar 2019, 03:51 Wes McKinney,  wrote:
> >
> > > hi folks,
> > >
> > > I think we're basically at the 0.13 end game here. There are some more
> > > patches that can get in, but do we all think we can cut an RC by the end of
> > > the week? What are the blocking issues?
> > >
> > > Thanks
> > > Wes
> > >
> 
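
The ARROW-4914 symptom quoted in this message (validity bits 10111, sliced
with offset=2 and len=3, yielding 101 instead of 111) can be illustrated with
a minimal sketch. This is not the Rust implementation; the helper names are
hypothetical, and the "ignore the offset" variant is only an assumed
illustration of how such a result can arise. The bits are packed
least-significant bit first, as Arrow validity bitmaps are.

```python
def pack_validity(bits):
    """Pack a list of 0/1 validity flags into bytes, least-significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def is_valid(bitmap, i):
    """Return True if bit i of the packed bitmap is set."""
    return (bitmap[i // 8] >> (i % 8)) & 1 == 1


def slice_validity(bitmap, offset, length):
    """Element i of the slice corresponds to bit (offset + i) of the parent."""
    return [int(is_valid(bitmap, offset + i)) for i in range(length)]


def slice_validity_ignoring_offset(bitmap, offset, length):
    """Assumed illustration of the bug: the slice offset is never applied."""
    return [int(is_valid(bitmap, i)) for i in range(length)]


parent = pack_validity([1, 0, 1, 1, 1])
print(slice_validity(parent, 2, 3))                  # [1, 1, 1]
print(slice_validity_ignoring_offset(parent, 2, 3))  # [1, 0, 1]
```

The point of the sketch is that a sliced array sharing its parent's bitmap
has to add its own offset on every validity lookup.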

[DISCUSS][Format] Time Interval Changes

2019-03-19 Thread Micah Kornfield
Hi Arrow Dev,
Based on the recent thread on discussing and voting on changes to files
under format, I figured I'd try to see how the process works for changes to
Schema.fbs to close out lingering time interval issues, in particular
ARROW-352 (Interval(DAY_TIME) has no unit) and ARROW-835 (Add Timedelta
type to describe time intervals).

I submitted a PR [1] that introduces a new DurationType that models
(sub)seconds (excluding leap seconds) as an 8-byte integer type.  Some of
these issues have been discussed previously; the most recent thread was
within the last month [2].

The reason for creating a new type is to avoid breaking changes with
existing types (in particular Interval[DAY_TIME] in Java). I think the
things worth discussing are:

1.  Is this a desirable change in principle?
2.  Naming: is DurationInterval a good name (should it be TimeDelta)?
3.  New Type: Should this be collapsed as a new enum on Interval? (Because
it excludes leap seconds, I think it still technically falls into the class
of calendar-like objects.)

Please feel free to add items for discussion.

I'm not sure how long discussions are typically held open, but it
would be great if we could try to reach a consensus sometime soon (and
then schedule a vote).  Maybe early next week is a good goal to aim for?

Thanks,
Micah


[1] https://github.com/apache/arrow/pull/3644
[2]
https://lists.apache.org/thread.html/0e606a6afd2332b4ae5b4382e533bea309c790ea71c05047cf983372@%3Cdev.arrow.apache.org%3E
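
As a rough illustration of the proposal above (an elapsed time stored as an
8-byte integer count of seconds or sub-second units), here is a minimal
sketch. It is not taken from the PR: the TimeUnit enum, the helper names, the
set of units and the little-endian byte order are all assumptions made for
the example.

```python
import struct
from enum import Enum


class TimeUnit(Enum):
    # Ticks per second for each unit (assumed set of units).
    SECOND = 1
    MILLISECOND = 1_000
    MICROSECOND = 1_000_000
    NANOSECOND = 1_000_000_000


def encode_duration(ticks):
    """Store a tick count as a signed little-endian 8-byte integer."""
    return struct.pack("<q", ticks)


def decode_duration(raw):
    """Read back the signed 8-byte tick count."""
    (ticks,) = struct.unpack("<q", raw)
    return ticks


def convert(ticks, src, dst):
    """Re-express a tick count in another unit (floors when precision is lost)."""
    return ticks * dst.value // src.value


raw = encode_duration(90)  # 90 ticks, read here as 90 seconds
print(len(raw))  # 8
print(convert(decode_duration(raw), TimeUnit.SECOND, TimeUnit.MILLISECOND))  # 90000
```

Because the value is a plain tick count with no calendar semantics, two
durations in the same unit can be compared or added as ordinary integers.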


[jira] [Created] (ARROW-4959) [Gandiva][Crossbow] Builds broken

2019-03-19 Thread Praveen Kumar Desabandu (JIRA)
Praveen Kumar Desabandu created ARROW-4959:
--

 Summary: [Gandiva][Crossbow] Builds broken
 Key: ARROW-4959
 URL: https://issues.apache.org/jira/browse/ARROW-4959
 Project: Apache Arrow
  Issue Type: Task
Reporter: Praveen Kumar Desabandu
Assignee: Praveen Kumar Desabandu


It looks like the Crossbow builds for Gandiva have been broken for the last few days.


