Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Wes McKinney
It's sort of unrelated to this conversation, but since someone
mentioned MXNet I want to call attention to a thread on their podling
mailing list about JIRA vs. GitHub issues:
https://lists.apache.org/thread.html/b4d174223d68c5822ea538f2609281c8023c7cc1eaef298bb2c4c186@%3Cdev.mxnet.apache.org%3E.

To summarize the thread: a lot of people don't like change, but
sometimes change is good. Some people have complained privately to me
that Arrow doesn't work like any other random project on GitHub.
Anyone who doesn't contribute to the project on account of that is,
IMHO, not a serious contributor.

I personally find JIRA to be an excellent tool, but it has a steeper
learning curve than GitHub and so it does take a bit of effort to
learn its features.

- Wes


On Thu, Jun 21, 2018 at 11:05 PM, Wes McKinney  wrote:
> Thanks all. I'm intrigued by Discourse (for some reason I keep typing
> "discourge.org"); we should inquire with ASF infra to see if they
> would be willing to support it for us. It's important that we develop
> a public record for the project, and for that data to be archived and
> indexed in some place that is owned by the ASF. I'm frankly -0 on
> having a fourth communication channel (outside of e-mail/JIRA/GitHub)
> since three is already a lot to keep track of. If we had a larger
> maintainer team, I might feel differently.
>
> Travis had some questions about GitHub and JIRA. JIRA is the only
> system of record for concrete development activity in the project. We
> use GitHub pull requests to submit patches (some projects use Gerrit,
> or attach patch files to JIRA), but all of the data generated on these
> PRs (code review comments, etc.) is mirrored back to JIRA.
> Furthermore, JIRA activity is relayed to the iss...@arrow.apache.org
> mailing list. So ultimately we have a public record for the project on
> mailing lists.
>
> Many newcomers have never interacted with an Apache project before,
> and so when they go to http://github.com/apache/arrow their first
> reaction is to look for the Issues tab to report a bug or ask for
> something. For a long time we didn't have issues turned on, and we
> found that people were "bouncing" rather than seeking out the mailing
> list or JIRA. We'd rather capture the information somewhere than
> lose it. We have an issue template asking people to either use
> the mailing list or JIRA, but a lot of people ignore it unfortunately:
> https://github.com/apache/arrow/blob/master/.github/ISSUE_TEMPLATE.md.
>
> - Wes
>
> On Thu, Jun 21, 2018 at 10:52 PM, Kenta Murata  wrote:
>> Hi everyone,
>>
>> I heard from Kou that you’re discussing whether to stop using Slack.
>> So I want to propose an alternative: using Discourse.
>>
>> On 2018/06/21 18:46:54, Dhruv Madeka  wrote:
>>> The issue with discourse is that you either have to host it or pay for them
>>> to host it
>>
>> Discourse provides a free hosting plan for community-friendly open source
>> projects.
>> See this article for the details:
>> 
>>
>>> but still +1 for discourse, it's a really nice format (I actually +1'ed the
>>> PyTorch forum on this thread too)
>>
>> I’m also +1 for Discourse because I manage
>> https://discourse.ruby-data.org/ under this plan.
>>
>>
>> Regards,
>> Kenta Murata


Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Wes McKinney
Thanks all. I'm intrigued by Discourse (for some reason I keep typing
"discourge.org"); we should inquire with ASF infra to see if they
would be willing to support it for us. It's important that we develop
a public record for the project, and for that data to be archived and
indexed in some place that is owned by the ASF. I'm frankly -0 on
having a fourth communication channel (outside of e-mail/JIRA/GitHub)
since three is already a lot to keep track of. If we had a larger
maintainer team, I might feel differently.

Travis had some questions about GitHub and JIRA. JIRA is the only
system of record for concrete development activity in the project. We
use GitHub pull requests to submit patches (some projects use Gerrit,
or attach patch files to JIRA), but all of the data generated on these
PRs (code review comments, etc.) is mirrored back to JIRA.
Furthermore, JIRA activity is relayed to the iss...@arrow.apache.org
mailing list. So ultimately we have a public record for the project on
mailing lists.

Many newcomers have never interacted with an Apache project before,
and so when they go to http://github.com/apache/arrow their first
reaction is to look for the Issues tab to report a bug or ask for
something. For a long time we didn't have issues turned on, and we
found that people were "bouncing" rather than seeking out the mailing
list or JIRA. We'd rather capture the information somewhere than
lose it. We have an issue template asking people to either use
the mailing list or JIRA, but a lot of people ignore it unfortunately:
https://github.com/apache/arrow/blob/master/.github/ISSUE_TEMPLATE.md.

- Wes

On Thu, Jun 21, 2018 at 10:52 PM, Kenta Murata  wrote:
> Hi everyone,
>
> I heard from Kou that you’re discussing whether to stop using Slack.
> So I want to propose an alternative: using Discourse.
>
> On 2018/06/21 18:46:54, Dhruv Madeka  wrote:
>> The issue with discourse is that you either have to host it or pay for them
>> to host it
>
> Discourse provides a free hosting plan for community-friendly open source
> projects.
> See this article for the details:
> 
>
>> but still +1 for discourse, it's a really nice format (I actually +1'ed the
>> PyTorch forum on this thread too)
>
> I’m also +1 for Discourse because I manage
> https://discourse.ruby-data.org/ under this plan.
>
>
> Regards,
> Kenta Murata


Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Kenta Murata
Hi everyone,

I heard from Kou that you’re discussing whether to stop using Slack.
So I want to propose an alternative: using Discourse.

On 2018/06/21 18:46:54, Dhruv Madeka  wrote:
> The issue with discourse is that you either have to host it or pay for them
> to host it

Discourse provides a free hosting plan for community-friendly open source projects.
See this article for the details:


> but still +1 for discourse, it's a really nice format (I actually +1'ed the
> PyTorch forum on this thread too)

I’m also +1 for Discourse because I manage https://discourse.ruby-data.org/
under this plan.


Regards,
Kenta Murata


Re: Gandiva Initiative

2018-06-21 Thread Wes McKinney
hi Jacques,

This is very exciting! LLVM codegen for Arrow has been on my wishlist
since the early days of the project. I always considered it more of a
"when" question than an "if".

I will take a closer look at the codebase to make some comments, but
my biggest initial question is whether we could work to make Gandiva
the official community-supported LLVM framework for creating
JIT-compiled Arrow kernels. In the Ursa Labs (a new lab I am building
to focus 90+% on Apache Arrow development) tech roadmap we discussed
the need for a subgraph compiler using LLVM:
https://ursalabs.org/tech/#subgraph-compilation-code-generation.

I would be interested in getting involved in the project, and I
expect in time many others will, as well. An obvious question would be
whether you would be interested in donating the project to Apache
Arrow and continuing the work there. We would benefit from common
build, testing/CI, and packaging/deployment infrastructure. I'm keen
to see JIT-powered predicate pushdown in Parquet files, for example.
Phillip and I could look into building a Gandiva backend for compiling
a subset of expressions originating from Ibis, a lazy-evaluation DSL
system with an API similar to pandas
(https://github.com/ibis-project/ibis).

best
Wes

On Thu, Jun 21, 2018 at 4:13 PM, Dimitri Vorona wrote:
> Hey Jacques,
>
> Great stuff! I'm actually researching the integration of arrow and flight
> into a main memory database which also uses LLVM for dynamic query
> generation! Excited to have a more detailed look at Gandiva!
>
> Cheers,
> Dimitri.
>
> On Thu, Jun 21, 2018, 21:15 Jacques Nadeau  wrote:
>
>> Hey Guys,
>>
>> Dremio just open sourced a new framework for processing data in Arrow data
>> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
>> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
>> Arrow Java libraries. I expect the developers who have been working on this
>> will introduce themselves soon. To read more about it, take a look at
>> Ravindra's blog post (he's the lead developer driving this work): [2].
>> Hopefully people will find this interesting/useful.
>>
>> Let us know what you all think!
>>
>> thanks,
>> Jacques
>>
>>
>> [1] https://github.com/dremio/gandiva
>> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
>>


Gandiva Initiative

2018-06-21 Thread Jacques Nadeau
Hey Guys,

Dremio just open sourced a new framework for processing data in Arrow data
structures [1], built on top of the Apache Arrow C++ APIs and leveraging
LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
Arrow Java libraries. I expect the developers who have been working on this
will introduce themselves soon. To read more about it, take a look at
Ravindra's blog post (he's the lead developer driving this work): [2].
Hopefully people will find this interesting/useful.

Let us know what you all think!

thanks,
Jacques


[1] https://github.com/dremio/gandiva
[2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/


Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Dhruv Madeka
The issue with discourse is that you either have to host it or pay for them
to host it

but still +1 for discourse, it's a really nice format (I actually +1'ed the
PyTorch forum on this thread too)

Dhruv

On Thu, Jun 21, 2018 at 2:09 PM, Travis Oliphant wrote:

> Hi everyone,
>
> I'll be chiming in from time to time as Anthony Scopatz and I help
> several people from Quansight become more integrated into community
> Arrow development.  I'm a fan of Arrow's goals and have similar goals
> for a cousin project called http://xnd.io.  I'm eager to find ways to
> collaborate between the two projects, particularly on the compute
> infrastructure.
>
> Here is my $0.02 on this issue,
>
> Chat rooms can be a useful mechanism for engaging with new developers.
> However, Slack itself does not really allow for the kind of
> large-scale community participation that Gitter allows for.  If you
> have a chat room I recommend Gitter.
>
> All that said, I would personally favor a discourse
> (https://www.discourse.org/) solution over chat rooms.  I've noticed
> several younger folks not really liking the mailing lists and seeking
> out chat rooms first --- the success of the PyTorch and MXNet communities
> indicates that perhaps they could be encouraged to use something like
> discourse.
>
> On a related note, could someone help me understand the relationship
> between Github Issues and JIRA issues? Is one preferred?  I understand
> that contributions to the code are recommended as PRs on Github.  Does
> that mean a branch at Github is considered to be the primary
> repository, or is there another place that code has to go to be
> official?  I suspect this is all documented somewhere. I would welcome
> a simple link to the right place to read.
>
> Thank you,
>
> -Travis
>
>
> ---
> Travis Oliphant
> Quansight
>
>
> On Thu, Jun 21, 2018 at 3:25 AM, Wes McKinney  wrote:
> > hi all,
> >
> > I wanted to bring up some concerns I have about the Slack room hosted
> > at http://apachearrow.slack.com.
> >
> > Corporate communications have changed a lot in recent years with the
> > new wave of IRC-like chat systems such as HipChat and Slack. In many
> > companies, Slack has become a preferred form of communication over
> > e-mail or other asynchronous messaging tools. This trend is negatively
> > impacting Apache Arrow in some ways that I will explain.
> >
> > Initially we created the Arrow Slack channel as a means of secondary
> > communication, to facilitate real-time discussions and help build the
> > community. So people, particularly newcomers, are coming to the
> > project and seeing 4 ways to communicate:
> >
> > * dev@ Mailing list
> > * JIRA
> > * GitHub
> > * Slack
> >
> > As a result of broader trends in the world, they are electing to use
> > Slack as their first, primary channel to interact with the project.
> > This is bad for many reasons:
> >
> > * Slack is essentially private. While anyone can join Slack, chats are
> > not archived in any public place, nor are they searchable through
> > internet search portals. I do not think it meets the public
> > communication requirements of Apache projects in general
> > * We've exceeded the message limit for free Slack channels; upgrading
> > to a paid Slack plan for Apache Arrow, with 650+ members, would be
> > very expensive
> > * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> > regularly on the Slack channel. I don't use Slack, for example, and I
> > would rather not be expected to
> > * We are geo-distributed in many time zones; even if we all used
> > Slack, synchronous/real-time chat to discuss the project is frequently
> > impractical
> >
> > Because of the "real-time" nature of IRC-like systems, people's
> > discussions and questions get intermingled, so keeping track of
> > longer-running discussions may be difficult. It's hard to know when
> > someone's question has been answered or whether people have
> > sufficiently discussed a particular topic.
> >
> > Many discussions or questions are by their nature asynchronous, and it
> > may take 24-72 hours or more for Arrow contributors to make a
> > thoughtful reply.
> >
> > As a result of all of this, we are missing opportunities to have
> > deeper discussions, develop the Arrow roadmap, create new JIRAs to
> > capture bug reports or feature requests, and other activities of
> > healthy open source communities. Additionally, the private nature of
> > Slack is causing organizational knowledge (particularly Q&A / FAQs) to
> > essentially be lost. Users with questions won't stumble on answers by
> > searching on Google (as they would with a mailing list or
> > StackOverflow).
> >
> > I don't think Slack is necessarily bad for users in a corporate
> > environment; in many companies it is expected that all people will
> > have the Slack client open at all times. This isn't the case here,
> > though.
> >
> > My strong preference in light of the activity I have been observing on
> > Slack (which I 

Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Travis Oliphant
Hi everyone,

I'll be chiming in from time to time as Anthony Scopatz and I help
several people from Quansight become more integrated into community
Arrow development.  I'm a fan of Arrow's goals and have similar goals
for a cousin project called http://xnd.io.  I'm eager to find ways to
collaborate between the two projects, particularly on the compute
infrastructure.

Here is my $0.02 on this issue,

Chat rooms can be a useful mechanism for engaging with new developers.
However, Slack itself does not really allow for the kind of
large-scale community participation that Gitter allows for.  If you
have a chat room I recommend Gitter.

All that said, I would personally favor a discourse
(https://www.discourse.org/) solution over chat rooms.  I've noticed
several younger folks not really liking the mailing lists and seeking
out chat rooms first --- the success of the PyTorch and MXNet communities
indicates that perhaps they could be encouraged to use something like
discourse.

On a related note, could someone help me understand the relationship
between Github Issues and JIRA issues? Is one preferred?  I understand
that contributions to the code are recommended as PRs on Github.  Does
that mean a branch at Github is considered to be the primary
repository, or is there another place that code has to go to be
official?  I suspect this is all documented somewhere. I would welcome
a simple link to the right place to read.

Thank you,

-Travis


---
Travis Oliphant
Quansight


On Thu, Jun 21, 2018 at 3:25 AM, Wes McKinney  wrote:
> hi all,
>
> I wanted to bring up some concerns I have about the Slack room hosted
> at http://apachearrow.slack.com.
>
> Corporate communications have changed a lot in recent years with the
> new wave of IRC-like chat systems such as HipChat and Slack. In many
> companies, Slack has become a preferred form of communication over
> e-mail or other asynchronous messaging tools. This trend is negatively
> impacting Apache Arrow in some ways that I will explain.
>
> Initially we created the Arrow Slack channel as a means of secondary
> communication, to facilitate real-time discussions and help build the
> community. So people, particularly newcomers, are coming to the
> project and seeing 4 ways to communicate:
>
> * dev@ Mailing list
> * JIRA
> * GitHub
> * Slack
>
> As a result of broader trends in the world, they are electing to use
> Slack as their first, primary channel to interact with the project.
> This is bad for many reasons:
>
> * Slack is essentially private. While anyone can join Slack, chats are
> not archived in any public place, nor are they searchable through
> internet search portals. I do not think it meets the public
> communication requirements of Apache projects in general
> * We've exceeded the message limit for free Slack channels; upgrading
> to a paid Slack plan for Apache Arrow, with 650+ members, would be
> very expensive
> * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> regularly on the Slack channel. I don't use Slack, for example, and I
> would rather not be expected to
> * We are geo-distributed in many time zones; even if we all used
> Slack, synchronous/real-time chat to discuss the project is frequently
> impractical
>
> Because of the "real-time" nature of IRC-like systems, people's
> discussions and questions get intermingled, so keeping track of
> longer-running discussions may be difficult. It's hard to know when
> someone's question has been answered or whether people have
> sufficiently discussed a particular topic.
>
> Many discussions or questions are by their nature asynchronous, and it
> may take 24-72 hours or more for Arrow contributors to make a
> thoughtful reply.
>
> As a result of all of this, we are missing opportunities to have
> deeper discussions, develop the Arrow roadmap, create new JIRAs to
> capture bug reports or feature requests, and other activities of
> healthy open source communities. Additionally, the private nature of
> Slack is causing organizational knowledge (particularly Q&A / FAQs) to
> essentially be lost. Users with questions won't stumble on answers by
> searching on Google (as they would with a mailing list or
> StackOverflow).
>
> I don't think Slack is necessarily bad for users in a corporate
> environment; in many companies it is expected that all people will
> have the Slack client open at all times. This isn't the case here,
> though.
>
> My strong preference in light of the activity I have been observing on
> Slack (which I encourage you to explore yourselves) would be to close
> the channel and have discussions or questions take place on the
> mailing list, JIRA, or GitHub (all of which are archived on one or
> more ASF mailing lists). Since migrating to Gitbox, we have enabled
> GitHub issues on the repository, which has helped lower the barrier
> for newcomers, but a large percentage of the time GitHub issues would
> be better as JIRA issues or e-mails (which is what the GitHub 

Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Ivan Ogasawara
+1 for dropping slack

On Thu, Jun 21, 2018 at 11:54 AM, Anthony Scopatz  wrote:

> I am +1 for dropping slack as well, for all of the reasons mentioned.
>
> On Thu, Jun 21, 2018 at 11:53 AM Dhruv Madeka  wrote:
>
> > Here's my best guess: emails force everyone on the list to read them, so
> > they have to meet a higher bar of importance?
> >
> > Pure guess there, I'm just channeling my experiences. I don't mind mailing
> > dev lists personally
> >
> > Dhruv
> >
> > On Thu, Jun 21, 2018 at 11:28 AM, Phillip Cloud wrote:
> >
> > > Dhruv,
> > >
> > > I'm curious why the dev mailing list is considered intrusive. Can you
> > > expand a bit on that? I've always thought of mailing lists to be *the*
> > > place where people go to ask questions about a project in a way that is
> > > open to all. They are also archived and organized in some way that makes
> > > it easy to go back and look at specific topics without having to piece
> > > together a topic's history from a large tapestry of interactions. If
> > > anything, I view chat as *more* intrusive since there's IMO an
> > > expectation of a faster response given that chat is real-time.
> > >
> > > On Thu, Jun 21, 2018 at 11:13 AM Dhruv Madeka  wrote:
> > >
> > > > Not to jump in too randomly, but for jupyter-widgets/bqplot
> > > >  - we haven't found an optimal
> > > > solution to this.
> > > >
> > > > - The dev mailing list is often considered to be intrusive
> > > > - GitHub issues aren't really used for simple questions or non-bug fixes
> > > > - Gitter remains our most popular source of questions, which suffers a
> > > > lot of the problems of Slack outlined in Wes' email
> > > >
> > > > We're considering discuss forums, especially after the large success of
> > > > the PyTorch and MXNet <https://discuss.mxnet.io>
> > > > forums for building community, allowing comfort in asking simple
> > > > questions and being stored/googleable
> > > >
> > > > Dhruv
> > > >
> > > > On Thu, Jun 21, 2018 at 4:25 AM, Wes McKinney wrote:
> > > >
> > > > > hi all,
> > > > >
> > > > > I wanted to bring up some concerns I have about the Slack room
> > > > > hosted at http://apachearrow.slack.com.
> > > > >
> > > > > Corporate communications have changed a lot in recent years with
> > > > > the new wave of IRC-like chat systems such as HipChat and Slack. In
> > > > > many companies, Slack has become a preferred form of communication
> > > > > over e-mail or other asynchronous messaging tools. This trend is
> > > > > negatively impacting Apache Arrow in some ways that I will explain.
> > > > >
> > > > > Initially we created the Arrow Slack channel as a means of
> > > > > secondary communication, to facilitate real-time discussions and
> > > > > help build the community. So people, particularly newcomers, are
> > > > > coming to the project and seeing 4 ways to communicate:
> > > > >
> > > > > * dev@ Mailing list
> > > > > * JIRA
> > > > > * GitHub
> > > > > * Slack
> > > > >
> > > > > As a result of broader trends in the world, they are electing to
> > > > > use Slack as their first, primary channel to interact with the
> > > > > project. This is bad for many reasons:
> > > > >
> > > > > * Slack is essentially private. While anyone can join Slack, chats
> > > > > are not archived in any public place, nor are they searchable through
> > > > > internet search portals. I do not think it meets the public
> > > > > communication requirements of Apache projects in general
> > > > > * We've exceeded the message limit for free Slack channels;
> > > > > upgrading to a paid Slack plan for Apache Arrow, with 650+ members,
> > > > > would be very expensive
> > > > > * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> > > > > regularly on the Slack channel. I don't use Slack, for example,
> > > > > and I would rather not be expected to
> > > > > * We are geo-distributed in many time zones; even if we all used
> > > > > Slack, synchronous/real-time chat to discuss the project is
> > > > > frequently impractical
> > > > >
> > > > > Because of the "real-time" nature of IRC-like systems, people's
> > > > > discussions and questions get intermingled, so keeping track of
> > > > > longer-running discussions may be difficult. It's hard to know when
> > > > > someone's question has been answered or whether people have
> > > > > sufficiently discussed a particular topic.
> > > > >
> > > > > Many discussions or questions are by their nature asynchronous, and
> > > > > it may take 24-72 hours or more for Arrow contributors to make a
> > > > > thoughtful reply.
> > > > >
> > > > > As a result of all of this, we are missing opportunities to have
> > > > > deeper discussions, develop the Arrow roadmap, create new JIRAs to
> > > > > capture bug reports or feature requests, and other activities of
> > > > > healthy open source communities. 

Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Anthony Scopatz
I am +1 for dropping slack as well, for all of the reasons mentioned.

On Thu, Jun 21, 2018 at 11:53 AM Dhruv Madeka  wrote:

> Here's my best guess: emails force everyone on the list to read them, so they
> have to meet a higher bar of importance?
>
> Pure guess there, I'm just channeling my experiences. I don't mind mailing
> dev lists personally
>
> Dhruv
>
> On Thu, Jun 21, 2018 at 11:28 AM, Phillip Cloud  wrote:
>
> > Dhruv,
> >
> > I'm curious why the dev mailing list is considered intrusive. Can you
> > expand a bit on that? I've always thought of mailing lists to be *the*
> > place where people go to ask questions about a project in a way that is
> > open to all. They are also archived and organized in some way that makes
> > it easy to go back and look at specific topics without having to piece
> > together a topic's history from a large tapestry of interactions. If
> > anything, I view chat as *more* intrusive since there's IMO an
> > expectation of a faster response given that chat is real-time.
> >
> > On Thu, Jun 21, 2018 at 11:13 AM Dhruv Madeka  wrote:
> >
> > > Not to jump in too randomly, but for jupyter-widgets/bqplot
> > >  - we haven't found an optimal
> > > solution to this.
> > >
> > > - The dev mailing list is often considered to be intrusive
> > > - GitHub issues aren't really used for simple questions or non-bug fixes
> > > - Gitter remains our most popular source of questions, which suffers a
> > > lot of the problems of Slack outlined in Wes' email
> > >
> > > We're considering discuss forums, especially after the large success of
> > > the PyTorch and MXNet <https://discuss.mxnet.io>
> > > forums for building community, allowing comfort in asking simple
> > > questions and being stored/googleable
> > >
> > > Dhruv
> > >
> > > On Thu, Jun 21, 2018 at 4:25 AM, Wes McKinney wrote:
> > >
> > > > hi all,
> > > >
> > > > I wanted to bring up some concerns I have about the Slack room hosted
> > > > at http://apachearrow.slack.com.
> > > >
> > > > Corporate communications have changed a lot in recent years with the
> > > > new wave of IRC-like chat systems such as HipChat and Slack. In many
> > > > companies, Slack has become a preferred form of communication over
> > > > e-mail or other asynchronous messaging tools. This trend is
> > > > negatively impacting Apache Arrow in some ways that I will explain.
> > > >
> > > > Initially we created the Arrow Slack channel as a means of secondary
> > > > communication, to facilitate real-time discussions and help build the
> > > > community. So people, particularly newcomers, are coming to the
> > > > project and seeing 4 ways to communicate:
> > > >
> > > > * dev@ Mailing list
> > > > * JIRA
> > > > * GitHub
> > > > * Slack
> > > >
> > > > As a result of broader trends in the world, they are electing to use
> > > > Slack as their first, primary channel to interact with the project.
> > > > This is bad for many reasons:
> > > >
> > > > * Slack is essentially private. While anyone can join Slack, chats
> > > > are not archived in any public place, nor are they searchable through
> > > > internet search portals. I do not think it meets the public
> > > > communication requirements of Apache projects in general
> > > > * We've exceeded the message limit for free Slack channels; upgrading
> > > > to a paid Slack plan for Apache Arrow, with 650+ members, would be
> > > > very expensive
> > > > * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> > > > regularly on the Slack channel. I don't use Slack, for example, and I
> > > > would rather not be expected to
> > > > * We are geo-distributed in many time zones; even if we all used
> > > > Slack, synchronous/real-time chat to discuss the project is
> > > > frequently impractical
> > > >
> > > > Because of the "real-time" nature of IRC-like systems, people's
> > > > discussions and questions get intermingled, so keeping track of
> > > > longer-running discussions may be difficult. It's hard to know when
> > > > someone's question has been answered or whether people have
> > > > sufficiently discussed a particular topic.
> > > >
> > > > Many discussions or questions are by their nature asynchronous, and
> > > > it may take 24-72 hours or more for Arrow contributors to make a
> > > > thoughtful reply.
> > > >
> > > > As a result of all of this, we are missing opportunities to have
> > > > deeper discussions, develop the Arrow roadmap, create new JIRAs to
> > > > capture bug reports or feature requests, and other activities of
> > > > healthy open source communities. Additionally, the private nature of
> > > > Slack is causing organizational knowledge (particularly Q&A / FAQs)
> > > > to essentially be lost. Users with questions won't stumble on answers by
> > > > searching on Google (as they would with a mailing list or
> > > > StackOverflow).
> > > >
> > > > I don't think Slack is 

Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Dhruv Madeka
Here's my best guess: emails force everyone on the list to read them, so they
have to meet a higher bar of importance?

Pure guess there, I'm just channeling my experiences. I don't mind mailing
dev lists personally

Dhruv

On Thu, Jun 21, 2018 at 11:28 AM, Phillip Cloud  wrote:

> Dhruv,
>
> I'm curious why the dev mailing list is considered intrusive. Can you
> expand a bit on that? I've always thought of mailing lists to be *the*
> place where people go to ask questions about a project in a way that is
> open to all. They are also archived and organized in some way that makes it
> easy to go back and look at specific topics without having to piece
> together a topic's history from a large tapestry of interactions. If
> anything, I view chat as *more* intrusive since there's IMO an expectation
> of a faster response given that chat is real-time.
>
> On Thu, Jun 21, 2018 at 11:13 AM Dhruv Madeka  wrote:
>
> > Not to jump in too randomly, but for jupyter-widgets/bqplot
> >  - we haven't found an optimal
> > solution to this.
> >
> > - The dev mailing list is often considered to be intrusive
> > - GitHub issues aren't really used for simple questions or non-bug fixes
> > - Gitter remains our most popular source of questions, which suffers a
> > lot of the problems of Slack outlined in Wes' email
> >
> > We're considering discuss forums, especially after the large success of
> > the PyTorch and MXNet <https://discuss.mxnet.io>
> > forums for building community, allowing comfort in asking simple
> > questions and being stored/googleable
> >
> > Dhruv
> >
> > On Thu, Jun 21, 2018 at 4:25 AM, Wes McKinney wrote:
> >
> > > hi all,
> > >
> > > I wanted to bring up some concerns I have about the Slack room hosted
> > > at http://apachearrow.slack.com.
> > >
> > > Corporate communications have changed a lot in recent years with the
> > > new wave of IRC-like chat systems such as HipChat and Slack. In many
> > > companies, Slack has become a preferred form of communication over
> > > e-mail or other asynchronous messaging tools. This trend is negatively
> > > impacting Apache Arrow in some ways that I will explain.
> > >
> > > Initially we created the Arrow Slack channel as a means of secondary
> > > communication, to facilitate real-time discussions and help build the
> > > community. So people, particularly newcomers, are coming to the
> > > project and seeing 4 ways to communicate:
> > >
> > > * dev@ Mailing list
> > > * JIRA
> > > * GitHub
> > > * Slack
> > >
> > > As a result of broader trends in the world, they are electing to use
> > > Slack as their first, primary channel to interact with the project.
> > > This is bad for many reasons:
> > >
> > > * Slack is essentially private. While anyone can join Slack, chats are
> > > not archived in any public place, nor are they searchable through
> > > internet search portals. I do not think it meets the public
> > > communication requirements of Apache projects in general
> > > * We've exceeded the message limit for free Slack channels; upgrading
> > > to a paid Slack plan for Apache Arrow, with 650+ members, would be
> > > very expensive
> > > * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> > > regularly on the Slack channel. I don't use Slack, for example, and I
> > > would rather not be expected to
> > > * We are geo-distributed in many time zones; even if we all used
> > > Slack, synchronous/real-time chat to discuss the project is frequently
> > > impractical
> > >
> > > Because of the "real-time" nature of IRC-like systems, people's
> > > discussions and questions get intermingled, so keeping track of
> > > longer-running discussions may be difficult. It's hard to know when
> > > someone's question has been answered or whether people have
> > > sufficiently discussed a particular topic.
> > >
> > > Many discussions or questions are by their nature asynchronous, and it
> > > may take 24-72 hours or more for Arrow contributors to make a
> > > thoughtful reply.
> > >
> > > As a result of all of this, we are missing opportunities to have
> > > deeper discussions, develop the Arrow roadmap, create new JIRAs to
> > > capture bug reports or feature requests, and other activities of
> > > healthy open source communities. Additionally, the private nature of
> > > Slack is causing organizational knowledge (particularly Q / FAQs) to
> > > essentially be lost. Users with questions won't stumble on answers by
> > > searching on Google (as they would with a mailing list or
> > > StackOverflow).
> > >
> > > I don't think Slack is necessarily bad for users in a corporate
> > > environment; in many companies it is expected that all people will
> > > have the Slack client open at all times. This isn't the case here,
> > > though.
> > >
> > > My strong preference in light of the activity I have been observing on
> > > Slack (which I encourage you to explore yourselves) would be to 

Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Phillip Cloud
Dhruv,

I'm curious why the dev mailing list is considered intrusive. Can you
expand a bit on that? I've always thought of mailing lists to be *the*
place where people go to ask questions about a project in a way that is
open to all. They are also archived and organized in some way that makes it
easy to go back and look at specific topics without having to piece
together a topic's history from a large tapestry of interactions. If
anything, I view chat as *more* intrusive since there's IMO an expectation
of a faster response given that chat is real-time.

On Thu, Jun 21, 2018 at 11:13 AM Dhruv Madeka  wrote:

> Not to jump in too randomly, but for jupyter-widgets/bqplot we haven't
> found an optimal solution to this.
>
> - The dev mailing list is often considered to be intrusive
> - GitHub issues aren't really used for simple questions or non-bug fixes
> - Gitter remains our most popular source of questions, which suffers from
> a lot of the problems of Slack outlined in Wes' email
>
> We're considering discuss forums, especially after the large success of the
> PyTorch and MXNet forums for building community, allowing comfort in asking
> simple questions and being stored/googleable
>
> Dhruv
>
> On Thu, Jun 21, 2018 at 4:25 AM, Wes McKinney  wrote:
>
> > hi all,
> >
> > I wanted to bring up some concerns I have about the Slack room hosted
> > at http://apachearrow.slack.com.
> >
> > Corporate communications have changed a lot in recent years with the
> > new wave of IRC-like chat systems such as HipChat and Slack. In many
> > companies, Slack has become a preferred form of communication over
> > e-mail or other asynchronous messaging tools. This trend is negatively
> > impacting Apache Arrow in some ways that I will explain.
> >
> > Initially we created the Arrow Slack channel as a means of secondary
> > communication, to facilitate real-time discussions and help build the
> > community. So people, particularly newcomers, are coming to the
> > project and seeing 4 ways to communicate:
> >
> > * dev@ Mailing list
> > * JIRA
> > * GitHub
> > * Slack
> >
> > As a result of broader trends in the world, they are electing to use
> > Slack as their first, primary channel to interact with the project.
> > This is bad for many reasons:
> >
> > * Slack is essentially private. While anyone can join Slack, chats are
> > not archived in any public place, nor are they searchable through
> > internet search portals. I do not think it meets the public
> > communication requirements of Apache projects in general
> > * We've exceeded the message limit for free Slack channels; upgrading
> > to a paid Slack plan for Apache Arrow, with 650+ members, would be
> > very expensive
> > * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> > regularly on the Slack channel. I don't use Slack, for example, and I
> > would rather not be expected to
> > * We are geo-distributed in many time zones; even if we all used
> > Slack, synchronous/real-time chat to discuss the project is frequently
> > impractical
> >
> > Because of the "real-time" nature of IRC-like systems, people's
> > discussions and questions get intermingled, so keeping track of
> > longer-running discussions may be difficult. It's hard to know when
> > someone's question has been answered or whether people have
> > sufficiently discussed a particular topic.
> >
> > Many discussions or questions are by their nature asynchronous, and it
> > may take 24-72 hours or more for Arrow contributors to make a
> > thoughtful reply.
> >
> > As a result of all of this, we are missing opportunities to have
> > deeper discussions, develop the Arrow roadmap, create new JIRAs to
> > capture bug reports or feature requests, and other activities of
> > healthy open source communities. Additionally, the private nature of
> > Slack is causing organizational knowledge (particularly Q / FAQs) to
> > essentially be lost. Users with questions won't stumble on answers by
> > searching on Google (as they would with a mailing list or
> > StackOverflow).
> >
> > I don't think Slack is necessarily bad for users in a corporate
> > environment; in many companies it is expected that all people will
> > have the Slack client open at all times. This isn't the case here,
> > though.
> >
> > My strong preference in light of the activity I have been observing on
> > Slack (which I encourage you to explore yourselves) would be to close
> > the channel and direct discussions or questions take place on the
> > mailing list, JIRA, or GitHub (all of which are archived on one or
> > more ASF mailing lists). Since migrating to Gitbox, we have enabled
> > GitHub issues on the repository, which has helped lower the barrier
> > for newcomers, but a large percentage of the time GitHub issues would
> > be better as JIRA issues or e-mails (which is what the GitHub issue
> > template says, alas).
> >
> > 

Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Antoine Pitrou


Hi,

I didn't know we had a Slack channel.  I agree we shouldn't use
something that has poor or nonexistent archival as a communication
channel.  The Discourse-based option(s) look better at least in that
regard, and probably also for categorization and navigation.

Regards

Antoine.


Le 21/06/2018 à 10:25, Wes McKinney a écrit :
> hi all,
> 
> I wanted to bring up some concerns I have about the Slack room hosted
> at http://apachearrow.slack.com.
> 
> Corporate communications have changed a lot in recent years with the
> new wave of IRC-like chat systems such as HipChat and Slack. In many
> companies, Slack has become a preferred form of communication over
> e-mail or other asynchronous messaging tools. This trend is negatively
> impacting Apache Arrow in some ways that I will explain.
> 
> Initially we created the Arrow Slack channel as a means of secondary
> communication, to facilitate real-time discussions and help build the
> community. So people, particularly newcomers, are coming to the
> project and seeing 4 ways to communicate:
> 
> * dev@ Mailing list
> * JIRA
> * GitHub
> * Slack
> 
> As a result of broader trends in the world, they are electing to use
> Slack as their first, primary channel to interact with the project.
> This is bad for many reasons:
> 
> * Slack is essentially private. While anyone can join Slack, chats are
> not archived in any public place, nor are they searchable through
> internet search portals. I do not think it meets the public
> communication requirements of Apache projects in general
> * We've exceeded the message limit for free Slack channels; upgrading
> to a paid Slack plan for Apache Arrow, with 650+ members, would be
> very expensive
> * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> regularly on the Slack channel. I don't use Slack, for example, and I
> would rather not be expected to
> * We are geo-distributed in many time zones; even if we all used
> Slack, synchronous/real-time chat to discuss the project is frequently
> impractical
> 
> Because of the "real-time" nature of IRC-like systems, people's
> discussions and questions get intermingled, so keeping track of
> longer-running discussions may be difficult. It's hard to know when
> someone's question has been answered or whether people have
> sufficiently discussed a particular topic.
> 
> Many discussions or questions are by their nature asynchronous, and it
> may take 24-72 hours or more for Arrow contributors to make a
> thoughtful reply.
> 
> As a result of all of this, we are missing opportunities to have
> deeper discussions, develop the Arrow roadmap, create new JIRAs to
> capture bug reports or feature requests, and other activities of
> healthy open source communities. Additionally, the private nature of
> Slack is causing organizational knowledge (particularly Q / FAQs) to
> essentially be lost. Users with questions won't stumble on answers by
> searching on Google (as they would with a mailing list or
> StackOverflow).
> 
> I don't think Slack is necessarily bad for users in a corporate
> environment; in many companies it is expected that all people will
> have the Slack client open at all times. This isn't the case here,
> though.
> 
> My strong preference in light of the activity I have been observing on
> Slack (which I encourage you to explore yourselves) would be to close
> the channel and direct discussions or questions take place on the
> mailing list, JIRA, or GitHub (all of which are archived on one or
> more ASF mailing lists). Since migrating to Gitbox, we have enabled
> GitHub issues on the repository, which has helped lower the barrier
> for newcomers, but a large percentage of the time GitHub issues would
> be better as JIRA issues or e-mails (which is what the GitHub issue
> template says, alas).
> 
> Interested to hear the thoughts of others on this.
> 
> Thanks,
> Wes
> 


Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Dhruv Madeka
Not to jump in too randomly, but for jupyter-widgets/bqplot we haven't
found an optimal solution to this.

- The dev mailing list is often considered to be intrusive
- GitHub issues aren't really used for simple questions or non-bug fixes
- Gitter remains our most popular source of questions, which suffers from
a lot of the problems of Slack outlined in Wes' email

We're considering discuss forums, especially after the large success of the
PyTorch and MXNet forums for building community, allowing comfort in asking
simple questions and being stored/googleable

Dhruv

On Thu, Jun 21, 2018 at 4:25 AM, Wes McKinney  wrote:

> hi all,
>
> I wanted to bring up some concerns I have about the Slack room hosted
> at http://apachearrow.slack.com.
>
> Corporate communications have changed a lot in recent years with the
> new wave of IRC-like chat systems such as HipChat and Slack. In many
> companies, Slack has become a preferred form of communication over
> e-mail or other asynchronous messaging tools. This trend is negatively
> impacting Apache Arrow in some ways that I will explain.
>
> Initially we created the Arrow Slack channel as a means of secondary
> communication, to facilitate real-time discussions and help build the
> community. So people, particularly newcomers, are coming to the
> project and seeing 4 ways to communicate:
>
> * dev@ Mailing list
> * JIRA
> * GitHub
> * Slack
>
> As a result of broader trends in the world, they are electing to use
> Slack as their first, primary channel to interact with the project.
> This is bad for many reasons:
>
> * Slack is essentially private. While anyone can join Slack, chats are
> not archived in any public place, nor are they searchable through
> internet search portals. I do not think it meets the public
> communication requirements of Apache projects in general
> * We've exceeded the message limit for free Slack channels; upgrading
> to a paid Slack plan for Apache Arrow, with 650+ members, would be
> very expensive
> * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> regularly on the Slack channel. I don't use Slack, for example, and I
> would rather not be expected to
> * We are geo-distributed in many time zones; even if we all used
> Slack, synchronous/real-time chat to discuss the project is frequently
> impractical
>
> Because of the "real-time" nature of IRC-like systems, people's
> discussions and questions get intermingled, so keeping track of
> longer-running discussions may be difficult. It's hard to know when
> someone's question has been answered or whether people have
> sufficiently discussed a particular topic.
>
> Many discussions or questions are by their nature asynchronous, and it
> may take 24-72 hours or more for Arrow contributors to make a
> thoughtful reply.
>
> As a result of all of this, we are missing opportunities to have
> deeper discussions, develop the Arrow roadmap, create new JIRAs to
> capture bug reports or feature requests, and other activities of
> healthy open source communities. Additionally, the private nature of
> Slack is causing organizational knowledge (particularly Q / FAQs) to
> essentially be lost. Users with questions won't stumble on answers by
> searching on Google (as they would with a mailing list or
> StackOverflow).
>
> I don't think Slack is necessarily bad for users in a corporate
> environment; in many companies it is expected that all people will
> have the Slack client open at all times. This isn't the case here,
> though.
>
> My strong preference in light of the activity I have been observing on
> Slack (which I encourage you to explore yourselves) would be to close
> the channel and direct discussions or questions take place on the
> mailing list, JIRA, or GitHub (all of which are archived on one or
> more ASF mailing lists). Since migrating to Gitbox, we have enabled
> GitHub issues on the repository, which has helped lower the barrier
> for newcomers, but a large percentage of the time GitHub issues would
> be better as JIRA issues or e-mails (which is what the GitHub issue
> template says, alas).
>
> Interested to hear the thoughts of others on this.
>
> Thanks,
> Wes
>


[jira] [Created] (ARROW-2729) [GLib] Add decimal128 array builder

2018-06-21 Thread yosuke shiro (JIRA)
yosuke shiro created ARROW-2729:
---

 Summary: [GLib] Add decimal128 array builder
 Key: ARROW-2729
 URL: https://issues.apache.org/jira/browse/ARROW-2729
 Project: Apache Arrow
  Issue Type: New Feature
  Components: GLib
Reporter: yosuke shiro


Support Decimal128Array and DecimalType.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Li Jin
Wes, thanks for bringing this up.

+1 from me. I rarely use Slack and mostly use the mailing list for
communication (for reasons similar to those Wes mentioned).

Li

On Thu, Jun 21, 2018 at 4:25 AM, Wes McKinney  wrote:

> hi all,
>
> I wanted to bring up some concerns I have about the Slack room hosted
> at http://apachearrow.slack.com.
>
> Corporate communications have changed a lot in recent years with the
> new wave of IRC-like chat systems such as HipChat and Slack. In many
> companies, Slack has become a preferred form of communication over
> e-mail or other asynchronous messaging tools. This trend is negatively
> impacting Apache Arrow in some ways that I will explain.
>
> Initially we created the Arrow Slack channel as a means of secondary
> communication, to facilitate real-time discussions and help build the
> community. So people, particularly newcomers, are coming to the
> project and seeing 4 ways to communicate:
>
> * dev@ Mailing list
> * JIRA
> * GitHub
> * Slack
>
> As a result of broader trends in the world, they are electing to use
> Slack as their first, primary channel to interact with the project.
> This is bad for many reasons:
>
> * Slack is essentially private. While anyone can join Slack, chats are
> not archived in any public place, nor are they searchable through
> internet search portals. I do not think it meets the public
> communication requirements of Apache projects in general
> * We've exceeded the message limit for free Slack channels; upgrading
> to a paid Slack plan for Apache Arrow, with 650+ members, would be
> very expensive
> * Only 3 out of the top 20 Arrow contributors (by # of commits) are
> regularly on the Slack channel. I don't use Slack, for example, and I
> would rather not be expected to
> * We are geo-distributed in many time zones; even if we all used
> Slack, synchronous/real-time chat to discuss the project is frequently
> impractical
>
> Because of the "real-time" nature of IRC-like systems, people's
> discussions and questions get intermingled, so keeping track of
> longer-running discussions may be difficult. It's hard to know when
> someone's question has been answered or whether people have
> sufficiently discussed a particular topic.
>
> Many discussions or questions are by their nature asynchronous, and it
> may take 24-72 hours or more for Arrow contributors to make a
> thoughtful reply.
>
> As a result of all of this, we are missing opportunities to have
> deeper discussions, develop the Arrow roadmap, create new JIRAs to
> capture bug reports or feature requests, and other activities of
> healthy open source communities. Additionally, the private nature of
> Slack is causing organizational knowledge (particularly Q / FAQs) to
> essentially be lost. Users with questions won't stumble on answers by
> searching on Google (as they would with a mailing list or
> StackOverflow).
>
> I don't think Slack is necessarily bad for users in a corporate
> environment; in many companies it is expected that all people will
> have the Slack client open at all times. This isn't the case here,
> though.
>
> My strong preference in light of the activity I have been observing on
> Slack (which I encourage you to explore yourselves) would be to close
> the channel and direct discussions or questions take place on the
> mailing list, JIRA, or GitHub (all of which are archived on one or
> more ASF mailing lists). Since migrating to Gitbox, we have enabled
> GitHub issues on the repository, which has helped lower the barrier
> for newcomers, but a large percentage of the time GitHub issues would
> be better as JIRA issues or e-mails (which is what the GitHub issue
> template says, alas).
>
> Interested to hear the thoughts of others on this.
>
> Thanks,
> Wes
>


Re: Running Apache Arrow on Visual Studio

2018-06-21 Thread diam5752



On 2018/06/21 12:46:44, Wes McKinney  wrote: 
> hi,
> 
> I think you need to add the .lib files for linking; adding the include
> directories alone won't do it. The DLL files must be in your %PATH% at
> runtime.
> 
> - Wes
> 
> On Thu, Jun 21, 2018 at 8:22 AM, diam5...@gmail.com  
> wrote:
> > Hello. I have installed apache arrow C++ API on windows 10 , and all the 
> > tests are running fine.
> >
> > But when I am trying to make my own c++ program using apache arrow I get 
> > this error :
> >
> >
> > “Severity:          Error
> > Code:              LNK2001
> > Description:       unresolved external symbol
> >                    "__declspec(dllimport) class arrow::MemoryPool * __cdecl
> >                    arrow::default_memory_pool(void)"
> >                    (__imp_?default_memory_pool@arrow@@YAPEAVMemoryPool@1@XZ)
> > Project:           test
> > File:              C:\Users\t_aggelosd\Desktop\test\test\test.obj
> > Line:              1
> > Suppression State: (empty)”
> >
> > I think I have included all the additional include directories in in visual 
> > studio. Have I done something wrong ?
> >
> >
> > This is my code:
> > “
> > #include "stdafx.h"
> > #include 
> > #include 
> > #include 
> > #include 
> > #include "arrow/util/visibility.h"
> > #include "arrow/memory_pool.h"
> > #include "arrow/array.h"
> > #include "arrow/allocator.h"
> > #include 
> > #include 
> >
> > using arrow::Int64Builder;
> >
> > int main()
> > {
> > arrow::default_memory_pool();
> > return 0;
> > }
> > ”
> >
> >
> >
> > Thanx for your time.
Thank you very much, I will try it.


Re: Running Apache Arrow on Visual Studio

2018-06-21 Thread Wes McKinney
hi,

I think you need to add the .lib files for linking; adding the include
directories alone won't do it. The DLL files must be in your %PATH% at
runtime.

- Wes
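
A rough sketch of the linking step described above, not a definitive setup. Assumptions that do not come from this thread: the shared Arrow build's import library is named arrow.lib, and the folder containing it is listed under Linker > General > Additional Library Directories. The names describe_pool and SKETCH_HAVE_ARROW are hypothetical, introduced only for illustration. The #pragma comment(lib, ...) line is MSVC-specific; listing arrow.lib under Linker > Input > Additional Dependencies should have the same effect.

```cpp
// Sketch only. Assumed (not confirmed in the thread): MSVC toolchain and a
// shared Arrow build whose import library is called arrow.lib.
#include <string>

// Only pull in the Arrow header when it is actually on the include path,
// so this file still compiles on a machine without Arrow.
#if defined(__has_include)
#  if __has_include("arrow/memory_pool.h")
#    include "arrow/memory_pool.h"
#    define SKETCH_HAVE_ARROW 1
#  endif
#endif

#if defined(SKETCH_HAVE_ARROW) && defined(_MSC_VER)
// Ask the MSVC linker for the import library. It supplies the __imp_
// symbols that the LNK2001 error reports as unresolved.
#  pragma comment(lib, "arrow.lib")
#endif

// Hypothetical helper: reports whether the default memory pool was obtained.
std::string describe_pool() {
#if defined(SKETCH_HAVE_ARROW)
  arrow::MemoryPool* pool = arrow::default_memory_pool();
  return pool != nullptr ? "got default memory pool" : "no memory pool";
#else
  return "Arrow headers not found";
#endif
}
```

Call describe_pool() from your own main(). Even once the link succeeds, arrow.dll still has to be found via %PATH% at runtime, as noted above.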

On Thu, Jun 21, 2018 at 8:22 AM, diam5...@gmail.com  wrote:
> Hello. I have installed apache arrow C++ API on windows 10 , and all the 
> tests are running fine.
>
> But when I am trying to make my own c++ program using apache arrow I get this 
> error :
>
>
> “Severity:          Error
> Code:              LNK2001
> Description:       unresolved external symbol
>                    "__declspec(dllimport) class arrow::MemoryPool * __cdecl
>                    arrow::default_memory_pool(void)"
>                    (__imp_?default_memory_pool@arrow@@YAPEAVMemoryPool@1@XZ)
> Project:           test
> File:              C:\Users\t_aggelosd\Desktop\test\test\test.obj
> Line:              1
> Suppression State: (empty)”
>
> I think I have included all the additional include directories in in visual 
> studio. Have I done something wrong ?
>
>
> This is my code:
> “
> #include "stdafx.h"
> #include 
> #include 
> #include 
> #include 
> #include "arrow/util/visibility.h"
> #include "arrow/memory_pool.h"
> #include "arrow/array.h"
> #include "arrow/allocator.h"
> #include 
> #include 
>
> using arrow::Int64Builder;
>
> int main()
> {
> arrow::default_memory_pool();
> return 0;
> }
> ”
>
>
>
> Thanx for your time.


Running Apache Arrow on Visual Studio

2018-06-21 Thread diam5752
Hello. I have installed the Apache Arrow C++ API on Windows 10, and all the
tests are running fine.

But when I try to build my own C++ program using Apache Arrow, I get this
error:


“Severity:          Error
Code:              LNK2001
Description:       unresolved external symbol
                   "__declspec(dllimport) class arrow::MemoryPool * __cdecl
                   arrow::default_memory_pool(void)"
                   (__imp_?default_memory_pool@arrow@@YAPEAVMemoryPool@1@XZ)
Project:           test
File:              C:\Users\t_aggelosd\Desktop\test\test\test.obj
Line:              1
Suppression State: (empty)”

I think I have included all the additional include directories in Visual
Studio. Have I done something wrong?


This is my code:
“
#include "stdafx.h"
#include 
#include 
#include 
#include 
#include "arrow/util/visibility.h"
#include "arrow/memory_pool.h"
#include "arrow/array.h"
#include "arrow/allocator.h"
#include 
#include 
 
using arrow::Int64Builder;

int main()
{
    arrow::default_memory_pool();
    return 0;
}
”

 

Thanx for your time.


[jira] [Created] (ARROW-2728) Pyarrow not adding partition columns when given a glob path

2018-06-21 Thread pranav kohli (JIRA)
pranav kohli created ARROW-2728:
---

 Summary: Pyarrow not adding partition columns when given a glob 
path
 Key: ARROW-2728
 URL: https://issues.apache.org/jira/browse/ARROW-2728
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.9.0
 Environment: pyarrow : 0.9.0.post1
dask : 0.17.1
Mac OS
Reporter: pranav kohli


I am saving a dask dataframe to parquet with two partition columns using the
pyarrow engine. The problem arises when scanning the partition columns. When I
scan using the directory path, I get the partition columns in the output
dataframe, whereas if I scan using a glob path, I don't get these columns.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[DISCUSS] Concerns about the Arrow Slack channel

2018-06-21 Thread Wes McKinney
hi all,

I wanted to bring up some concerns I have about the Slack room hosted
at http://apachearrow.slack.com.

Corporate communications have changed a lot in recent years with the
new wave of IRC-like chat systems such as HipChat and Slack. In many
companies, Slack has become a preferred form of communication over
e-mail or other asynchronous messaging tools. This trend is negatively
impacting Apache Arrow in some ways that I will explain.

Initially we created the Arrow Slack channel as a means of secondary
communication, to facilitate real-time discussions and help build the
community. So people, particularly newcomers, are coming to the
project and seeing 4 ways to communicate:

* dev@ Mailing list
* JIRA
* GitHub
* Slack

As a result of broader trends in the world, they are electing to use
Slack as their first, primary channel to interact with the project.
This is bad for many reasons:

* Slack is essentially private. While anyone can join Slack, chats are
not archived in any public place, nor are they searchable through
internet search portals. I do not think it meets the public
communication requirements of Apache projects in general
* We've exceeded the message limit for free Slack channels; upgrading
to a paid Slack plan for Apache Arrow, with 650+ members, would be
very expensive
* Only 3 out of the top 20 Arrow contributors (by # of commits) are
regularly on the Slack channel. I don't use Slack, for example, and I
would rather not be expected to
* We are geo-distributed in many time zones; even if we all used
Slack, synchronous/real-time chat to discuss the project is frequently
impractical

Because of the "real-time" nature of IRC-like systems, people's
discussions and questions get intermingled, so keeping track of
longer-running discussions may be difficult. It's hard to know when
someone's question has been answered or whether people have
sufficiently discussed a particular topic.

Many discussions or questions are by their nature asynchronous, and it
may take 24-72 hours or more for Arrow contributors to make a
thoughtful reply.

As a result of all of this, we are missing opportunities to have
deeper discussions, develop the Arrow roadmap, create new JIRAs to
capture bug reports or feature requests, and other activities of
healthy open source communities. Additionally, the private nature of
Slack is causing organizational knowledge (particularly Q&A / FAQs) to
essentially be lost. Users with questions won't stumble on answers by
searching on Google (as they would with a mailing list or
StackOverflow).

I don't think Slack is necessarily bad for users in a corporate
environment; in many companies it is expected that all people will
have the Slack client open at all times. This isn't the case here,
though.

My strong preference in light of the activity I have been observing on
Slack (which I encourage you to explore yourselves) would be to close
the channel and have discussions or questions take place on the
mailing list, JIRA, or GitHub (all of which are archived on one or
more ASF mailing lists). Since migrating to Gitbox, we have enabled
GitHub issues on the repository, which has helped lower the barrier
for newcomers, but a large percentage of the time GitHub issues would
be better as JIRA issues or e-mails (which is what the GitHub issue
template says, alas).

Interested to hear the thoughts of others on this.

Thanks,
Wes


[jira] [Created] (ARROW-2727) Unable to build java module

2018-06-21 Thread Jeff Zhang (JIRA)
Jeff Zhang created ARROW-2727:
-

 Summary: Unable to build java module
 Key: ARROW-2727
 URL: https://issues.apache.org/jira/browse/ARROW-2727
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Jeff Zhang


Due to pom issue.

{code}
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)