Pearu Peterson created ARROW-2944:
Summary: Arrow format documentation mentions VectorLayout that
does not exist anymore
Key: ARROW-2944
URL: https://issues.apache.org/jira/browse/ARROW-2944
Wes McKinney created ARROW-2943:
Summary: [C++] Implement BufferedOutputStream::Flush
Key: ARROW-2943
URL: https://issues.apache.org/jira/browse/ARROW-2943
Project: Apache Arrow
Issue Type:
> I would like to point out that arrow's use of orc is a great example of how
> it would be possible to manage parquet-cpp as a separate codebase. That gives
> me hope that the projects could be managed separately some day.
Well, I don't know that ORC is the best example. The ORC C++ codebase
Your point about the constraints of the ASF release process is well
taken, and as a developer who's trying to work in the current environment I
would be much happier if the codebases were merged. The main issues I worry
about when you put codebases like these together are:
1. The delineation of
hi Josh,
> I can imagine use cases for parquet that don't involve arrow and tying them
> together seems like the wrong choice.
Apache is "Community over Code"; right now it's the same people
building these projects -- my argument (which I think you agree with?)
is that we should work more
I recently worked on an issue that had to be implemented in parquet-cpp
(ARROW-1644, ARROW-1599) but required changes in arrow (ARROW-2585,
ARROW-2586). I found the circular dependencies confusing and hard to work
with. For example, I still have a PR open in parquet-cpp (created on May
10) because
On Mon, Jul 30, 2018 at 8:50 PM, Ted Dunning wrote:
> On Mon, Jul 30, 2018 at 5:39 PM Wes McKinney wrote:
>
>>
>> > The community will be less willing to accept large
>> > changes that require multiple rounds of patches for stability and API
>> > convergence. Our contributions to Libhdfs++ in
On Mon, Jul 30, 2018 at 5:39 PM Wes McKinney wrote:
>
> > The community will be less willing to accept large
> > changes that require multiple rounds of patches for stability and API
> > convergence. Our contributions to Libhdfs++ in the HDFS community took a
> > significantly long time for the
hi,
On Mon, Jul 30, 2018 at 6:52 PM, Deepak Majeti wrote:
> Wes,
>
> I definitely appreciate and do see the impact of contributions made by
> everyone. I made this statement not to rate any contributions but solely to
> support my concern.
> The contribution barrier is higher simply because of
Phillip Cloud created ARROW-2942:
Summary: [Packaging] Allow a user to inspect the status of another
user's builds
Key: ARROW-2942
URL: https://issues.apache.org/jira/browse/ARROW-2942
Project:
Phillip Cloud created ARROW-2941:
Summary: [Packaging] Allow a user to kill existing builds
Key: ARROW-2941
URL: https://issues.apache.org/jira/browse/ARROW-2941
Project: Apache Arrow
Issue
I'm not going to comment on the design of the parquet-cpp module and whether it
is “closer” to parquet or arrow.
But I do think Wes’s proposal is consistent with Apache policy. PMCs make
releases and govern communities; they don’t exist to manage code bases, except
as a means to the end of
Wes,
I definitely appreciate and do see the impact of contributions made by
everyone. I made this statement not to rate any contributions but solely to
support my concern.
The contribution barrier is higher simply because of the increased code,
build, and test dependencies. If the community has
Philipp Moritz created ARROW-2940:
Summary: [Python] Import error with pytorch 0.3
Key: ARROW-2940
URL: https://issues.apache.org/jira/browse/ARROW-2940
Project: Apache Arrow
Issue Type:
hi Deepak
On Mon, Jul 30, 2018 at 5:18 PM, Deepak Majeti wrote:
> @Wes
> My observation is that most of the parquet-cpp contributors you listed that
> overlap with the Arrow community mainly contribute to the Arrow
> bindings(parquet::arrow layer)/platform API changes in the parquet-cpp
> repo.
Hi Richard,
Take a look at this JIRA https://issues.apache.org/jira/browse/SPARK-24579,
it is geared towards exporting Spark data to DL frameworks, but it's likely
to add a general method to map Spark data partitions to a function using
Arrow data. In that function you should be able to apply
@Wes
My observation is that most of the parquet-cpp contributors you listed that
overlap with the Arrow community mainly contribute to the Arrow
bindings(parquet::arrow layer)/platform API changes in the parquet-cpp
repo. Very few of them review/contribute patches to the parquet-cpp core.
I
Ian Robertson created ARROW-2939:
Summary: [Python] API documentation version doesn't match latest
on PyPI
Key: ARROW-2939
URL: https://issues.apache.org/jira/browse/ARROW-2939
Project: Apache Arrow
Krisztian Szucs created ARROW-2938:
Summary: [Packaging] Make the source release via crossbow
Key: ARROW-2938
URL: https://issues.apache.org/jira/browse/ARROW-2938
Project: Apache Arrow
Hi Arrow devs,
I am trying to separate reading only pageHeaders from reading
(reading+uncompressing+serializing) its entire content.
The current SerializedPageReader::NextPage() does both things at the same
time.
I tried importing format::PageHeader into a separate project linking
against a build
Sounds good. I will start cranking on this later today and provide an
update tomorrow morning about any progress or issues that arise.
On Mon, Jul 30, 2018 at 11:05 AM Wes McKinney wrote:
> hey Phillip,
>
> I think this is getting too complicated and it's going to hold up the
> release more
hey Phillip,
I think this is getting too complicated and it's going to hold up the
release more than it already has. How about we cut 0.10.0 binaries
based on the git tag and we try for using a signed tarball for 0.11?
I'm concerned we're going to miss our window this week to get an RC
cut and
Wes McKinney created ARROW-2937:
Summary: [Java] Follow-up changes to ARROW-2704
Key: ARROW-2937
URL: https://issues.apache.org/jira/browse/ARROW-2937
Project: Apache Arrow
Issue Type:
Wanted to update everyone here regarding the ability to cut a release
candidate for 0.10.0.
The last remaining set of tasks is to be able to use the new packaging tool
(crossbow.py) to build binary artifacts from a source archive. What this
means is that we'll have to move the release scripts
On 30/07/2018 at 10:50, Antoine Pitrou wrote:
>
> Hi Wes,
>
> On 29/07/2018 at 01:44, Wes McKinney wrote:
>> I believe the best way to remedy the situation is to adopt a
>> "Community over Code" approach and find a way for the Parquet and
>> Arrow C++ development communities to operate out
Hi Wes,
On 29/07/2018 at 01:44, Wes McKinney wrote:
> I believe the best way to remedy the situation is to adopt a
> "Community over Code" approach and find a way for the Parquet and
> Arrow C++ development communities to operate out of the same code
> repository, i.e. the apache/arrow git