Re: [VOTE] Move Arrow DataFusion Subproject to new Top Level Apache Project

2024-03-03 Thread Julian Hyde
+1 (binding) > On Mar 2, 2024, at 2:28 PM, Dewey Dunnington > wrote: > > +1 (binding) > >> On Sat, Mar 2, 2024 at 8:08 AM vin jake wrote: >> >> +1 (binding) >> >>> On Fri, Mar 1, 2024 at 7:33 PM Andrew Lamb wrote: >>> >>> Hello, >>> >>> As we have discussed[1][2] I would like to vote

Re: [DISCUSS] Move sqlparser-rs back into DataFusion project?

2024-02-26 Thread Julian Hyde
I am torn on this. On one hand, I am a big fan of components that are standalone - have no more dependencies than necessary, and are self-evidently standalone. So, I think that re-absorbing sqlparser-rs back into DataFusion would not be a good step. It would reduce the perception that it is

Re: [DISCUSS] Status and future of @ApacheArrow Twitter account

2024-01-29 Thread Julian Hyde
The easiest thing is to share the Twitter credentials with any PMC member who is interested in sending tweets (which is usually a very small number). To answer Antoine’s point. I have found Twitter an extremely effective way for an open-source project to communicate with the “exo-community” —

Re: [DISCUSS] Linear Formula Types

2024-01-07 Thread Julian Hyde
If the DB layer above Arrow supports it, I would define a (non-stored) calculated column. Given celsius_percent between 0 and 1, I would define fahrenheit as (32 + celsius_percent * 1.8). A good query optimizer would convert the condition 'where fahrenheit > 122' into 'where celsius_percent >
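
The predicate rewrite mentioned above amounts to inverting the linear formula. A minimal sketch in Python, assuming a hypothetical helper named push_down_threshold (it is not part of Arrow or of any particular optimizer) and handling only a positive slope:

```python
def push_down_threshold(threshold: float,
                        intercept: float = 32.0,
                        slope: float = 1.8) -> float:
    """Translate 'derived > threshold' into 'base > result'.

    Assumes derived = intercept + base * slope with slope > 0, as in
    fahrenheit = 32 + celsius_percent * 1.8 from the message above.
    The function name and signature are hypothetical, for illustration only.
    """
    if slope <= 0:
        # A zero slope cannot be inverted and a negative slope would flip
        # the comparison; this sketch handles neither case.
        raise ValueError("slope must be positive")
    return (threshold - intercept) / slope


bound = push_down_threshold(122.0)
print(f"where fahrenheit > 122  =>  where celsius_percent > {bound:g}")
```

Because the condition ends up on the stored column, the calculated column never has to be materialized to evaluate the filter.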

Re: Call for Presentations, Community Over Code 2023

2023-05-10 Thread Julian Hyde
Follow the unsubscribe instructions: https://arrow.apache.org/community/#mailing-lists On Wed, May 10, 2023 at 10:27 AM Gil Cohen wrote: > > How do I unsubscribe? > > On Wed, 10 May 2023 at 0:36 Rich Bowen wrote: > > > (Note: You are receiving this because you are subscribed to the dev@ > >

Re: Apache Arrow PGP Key

2023-03-20 Thread Julian Hyde
I think we’re confusing two concepts: signing each others’ keys and adding them to the KEYS file. It is reasonable that we, as a community, extend the web of trust by mutual signing. Let’s suppose Wes and I have signed each other’s keys. Someone from the Pandas community, who knows Wes,

Re: Pre-release feedback for 'nanoarrow'

2023-01-02 Thread Julian Hyde
Can you make sure that it adheres to ASF branding guidelines? As an ASF project its name should be “Apache Nanoarrow” and it should define itself in terms of its relationship with “Apache Arrow”. Julian > On Jan 2, 2023, at 8:28 AM, Dewey Dunnington > wrote: > > Hi all, > > Following a

Re: Parser for ExecPlans

2022-11-03 Thread Julian Hyde
When people design a language to represent a data structure, they often do a poor job with literals (i.e. the constant values for each data type). And that causes problems with operator overloading. I recommend that you give each data type its own literal format, so you can distinguish, say, a

Re: [RUST][Go][proposal] Arrow Intermediate Representation to facilitate the transformation of row-oriented data sources into Arrow columnar representation

2022-07-28 Thread Julian Hyde
If the 'row-oriented format' is an API rather than a physical data representation then it can be implemented via coroutines and could therefore have less scattered patterns of read/write access. By 'coroutines' I'm being rather imprecise, but I hope you get the general idea. An asynchronous API

Re: [Rust] [DataFusion] Discuss moving Python bindings back to Apache Arrow

2022-07-15 Thread Julian Hyde
Have significant changes been made since January? If not, IP clearance may not be required. The code as of January is still kosher Arrow IP, even if it’s been deleted from git. Julian > On Jul 15, 2022, at 7:02 PM, Andy Grove wrote: > > datafusion-python was donated to the Apache Arrow

Re: Apache Arrow and "Native"-themed mascotry

2022-07-10 Thread Julian Hyde
Walter, I am very sympathetic to your concerns about the Apache brand but your email contains an untrue statement that needs to be corrected. You say that "Arrow’s project name is an unfortunate example of ASF’s stereotyping", but stereotyping implies intent: that the people who named the Arrow

Re: Adding Apache Arrow to the registry of Digital Public Goods

2022-03-25 Thread Julian Hyde
Good idea. I posted on ComDev. It would be interesting to know what Fineract's experience was, and whether other projects have considered this. Julian [1] https://lists.apache.org/thread/kgg6ml3n5ddr4ndbhnsfxc4ynn41djss On Fri, Mar 25, 2022 at 10:08 AM Wes McKinney wrote: > > As some research

Re: [FlightSQL] Higher-level facade API to increase adoption/audience? Or does this belong as a personal project

2022-03-15 Thread Julian Hyde
being contributed already. Which as you noted there is power in >>>> standards, so I expect this avenue to see heavy use. >>>> 2. For clients that can handle it and want to go through the trouble, >>>> consuming the data directly as Arrow for efficiency purpose

Re: [FlightSQL] Higher-level facade API to increase adoption/audience? Or does this belong as a personal project

2022-03-14 Thread Julian Hyde
When I read “language-agnostic standard for data access” I cringed a little. (See [1].) Sure, it’s fun to create a new standard. But if your standard is successful, there will need to be a huge amount of work changing existing code to use your standard. That effort might even be difference

Re: Managing usage of the @ApacheArrow Twitter handle and other social media

2022-02-01 Thread Julian Hyde
In my opinion, any PMC member should be allowed to use the Twitter account without any other checks, balances, or friction. They know that they are speaking for the project, and only for the project. They are PMC members so we trust them to do the right thing. If committers and other non-PMC

Re: [DISCUSS] Deprecate user@ in favor for github issues/discussions

2021-09-29 Thread Julian Hyde
I'm not for or against this proposal. I took a few minutes to browse the archives [1]. It seems to me that the user@ list is working extremely well. People get answers quickly, problems are converted into JIRA cases, and the discussion often references existing information sources. I want to

Re: Temporal Arithmetic

2021-09-23 Thread Julian Hyde
I wouldn’t discuss the algorithm on this list. I’d just commit to being compatible with Postgres, and write a bunch of tests based on Postgres’ observed behavior. > On Sep 23, 2021, at 5:12 AM, Phillip Cloud wrote: > > Hi all, > > I wanted to draw some attention to ARROW-11090 [1] in an

Re: [DISCUSS] Developing an "Arrow Compute IR [Intermediate Representation]" to decouple language front ends from Arrow-native compute engines

2021-08-12 Thread Julian Hyde
cooks" by setting up the ComputeIR project somewhere >>>>> separate from the format/ directory to permit it to exist in a >>>>> Work-In-Progress status for a period of time until we work through the >>>>> various details and design concerns. &

Re: [DISCUSS] Developing an "Arrow Compute IR [Intermediate Representation]" to decouple language front ends from Arrow-native compute engines

2021-08-05 Thread Julian Hyde
Wes, Thanks for this. I’ve added comments to the doc and to the PR. The biggest surprise is that this language does full relational operations. I was expecting that it would do fragments of the operations. Consider join. A distributed hybrid hash join needs to partition rows into output

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-28 Thread Julian Hyde
D (2nd choice E if we’re doing ranked-choice voting) Julian > On Jun 24, 2021, at 12:24 PM, Weston Pace wrote: > > The discussion in [1] led to the following question. Before we > proceed on a vote it was decided we should do a straw poll to settle > on an approach (which can then be voted

Re: [Discuss] If and how we should integrate geospatial data (specs) in Arrow

2021-06-25 Thread Julian Hyde
Cc += geospatial@. I think allowing WKB and WKT is sufficient. Perhaps Geometry could be a composite type (WKT, SRID) or (WKB, SRID). SRID (spatial reference identifier) is almost always needed to qualify a geometry value. It is analogous to how TimeZone is needed (implicitly or explicitly) to
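
The composite (WKT, SRID) idea above can be sketched as a small value type. This is an illustration in Python only; the class name Geometry and the helper same_reference_system are assumptions, not an Arrow type or API. The SRIDs used are the common EPSG codes 4326 (WGS 84) and 3857 (Web Mercator).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    """Hypothetical composite value: geometry text plus the SRID that qualifies it."""
    wkt: str   # geometry encoded as Well-Known Text
    srid: int  # spatial reference identifier for the coordinates


def same_reference_system(a: Geometry, b: Geometry) -> bool:
    """Two geometries are directly comparable only if their SRIDs match."""
    return a.srid == b.srid


p = Geometry("POINT (30 10)", 4326)
q = Geometry("POINT (30 10)", 3857)

# Identical text, different SRID: the values are not equal.
print(p == q, same_reference_system(p, q))
```

Keeping the SRID inside the value plays the same role as keeping a time zone next to a timestamp: the raw coordinates alone do not say what they mean.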

Re: [VOTE] Clarify meaning of timestamp without time zone to equal the concept of "LocalDateTime"

2021-06-25 Thread Julian Hyde
+1 > On Jun 25, 2021, at 10:36 AM, Antoine Pitrou wrote: > > > Le 24/06/2021 à 21:16, Weston Pace a écrit : >> The discussion in [1] led to the following proposal which I would like >> to submit for a vote. >> --- >> Arrow allows a timestamp column to omit the time zone property. This >> has

Re: [Format][Important] Needed clarification of timezone-less timestamps

2021-06-22 Thread Julian Hyde
t time zone > * > https://docs.google.com/document/d/1wDAuxEDVo3YxZx20fGUGqQxi3aoss7TJ-TzOUjaoZk8/edit?usp=sharing > > # Proposal: Arrow should define how an “Instant” is stored > * > https://docs.google.com/document/d/1xEKRhs-GUSMwjMhgmQdnCNMXwZrA10226AcXRoP8g9E/edit?usp=sharing >

Re: [Format][Important] Needed clarification of timezone-less timestamps

2021-06-22 Thread Julian Hyde
My proposal is that Arrow should support three different kinds of date-times: zoneless, zoned, and instant. (Not necessarily with those names.) All three kinds occur frequently in the industry. Many systems only have two, and users of those systems have figured out how to make do. (For

Re: [Format] Timestamp timezone semantics?

2021-06-04 Thread Julian Hyde
e objects — if you call access > datetime.hour on a timezone-less datetime.datetime, it will return the > same result no matter where in the world you are. > > On Thu, Jun 3, 2021 at 1:19 PM Julian Hyde wrote: >> >> It seems that Arrow’s timestamp type can either have no t

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Julian Hyde
It seems that Arrow’s timestamp type can either have no time zone or be UTC. I think that is a flawed design, because it doesn’t catch user errors. Suppose you want to find the number of milliseconds between two timestamps. If the first has a timezone and the second is implicitly UTC, then you can
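
As an illustration of what "catching user errors" can look like, here is a minimal sketch using Python's standard datetime module, which refuses to subtract a timestamp with a time zone from one without. It shows the behaviour of a different library, not of Arrow; the helper millis_between is a hypothetical name.

```python
from datetime import datetime, timezone

aware = datetime(2021, 6, 3, 12, 0, tzinfo=timezone.utc)  # has a time zone
naive = datetime(2021, 6, 3, 12, 0)                       # has no time zone


def millis_between(a: datetime, b: datetime) -> float:
    """Milliseconds from b to a; both must be of the same kind."""
    return (a - b).total_seconds() * 1000.0


try:
    millis_between(aware, naive)
    mixed_allowed = True
except TypeError:
    # Python raises TypeError when a naive and an aware datetime are mixed,
    # so the mistake surfaces instead of producing a silent wrong answer.
    mixed_allowed = False

print("mixing allowed:", mixed_allowed)
```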

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Julian Hyde
My answer to Antoine’s question would not be “kind of”, it would be “no”. In a system such as Joda-time, which I claim is the only system that Arrow should be considering, a timestamp-without-timezone does not have an implicit time zone of UTC. It has no time zone. > On Jun 3, 2021, at 8:52

Re: [Format] Timestamp timezone semantics?

2021-06-02 Thread Julian Hyde
> On Jun 2, 2021, at 1:56 PM, Micah Kornfield wrote: > > > At least in bigquery we do the following mapping: > SQL TIMESTAMP -> Arrow Timestamp with "UTC" timezone > SQL DATETIME -> Arrow Timestamp without a time-zone. BigQuery was one of the systems I had in mind when I said "naming is a

Re: [Format] Timestamp timezone semantics?

2021-06-02 Thread Julian Hyde
Good time libraries support all. E.g. Jodatime [1] has * Instant - an instantaneous point on the time-line * DateTime - full date and time with time-zone * LocalDateTime - date-time without a time-zone The SQL world isn't quite as much of a mess as Adam makes it out to be. The SQL standard
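
The three Joda-Time kinds listed above can be approximated with Python's standard library. This is a rough analogy under stated assumptions, not a mapping defined by Arrow or Joda-Time: an epoch number stands in for Instant, an aware datetime for DateTime, and a naive datetime for LocalDateTime. The fixed UTC-7 offset is an arbitrary example zone.

```python
from datetime import datetime, timedelta, timezone

# Instant: a point on the time-line, here seconds since the Unix epoch.
instant = datetime(2021, 6, 2, 12, 0, tzinfo=timezone.utc).timestamp()

# DateTime: full date and time with a time zone. Two views of one instant.
utc_view = datetime.fromtimestamp(instant, tz=timezone.utc)
west_view = utc_view.astimezone(timezone(timedelta(hours=-7)))

# LocalDateTime: date-time without a time zone. It names a wall-clock
# reading and does not identify a point on the time-line by itself.
local = datetime(2021, 6, 2, 12, 0)

print(utc_view.isoformat(), west_view.isoformat(), local.isoformat())
```

The two zoned values compare equal because they denote the same instant, while the zoneless value carries no tzinfo at all.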

Re: Long title on github page

2021-05-17 Thread Julian Hyde
rowth. We have a data format, yes, >> but we are also creating a computational platform to go hand-in-hand >> with the data format to make it easier to build fast applications that >> use the data format. So the description needs to capture both of these >> ideas. >> >>

Re: Long title on github page

2021-05-17 Thread Julian Hyde
I think that the “cross-language development platform for” is noise. (I’m sure that JPEG developers think that JPEG is a “cross-language development platform” too. But it isn’t. It is an image format.) “Apache Arrow is a data format for efficient in-memory processing.” I’ll note that In

Re: [C++][DISCUSS] Implementing interpreted (non-compiled) tests for compute functions

2021-05-14 Thread Julian Hyde
Do any of these compute functions have analogs in other implementations of Arrow (e.g. Rust)? I believe that as much as possible of Arrow’s compute functionality should be cross-language. Perhaps there are language-specific differences in how functions are invoked, but the basic

Re: [Discuss] Storing metadata about the "sortedness" of data

2021-05-11 Thread Julian Hyde
Note that Calcite’s Statistic interface is heavily simplified, designed to be really simple for people to implement when they write their first table adapter. There are more advanced forms of metadata, such as RelMdDistribution [1] and Collation [2]. Since Arrow data sets will typically

Re: [RUST] Proposal for more frequent Rust Arrow release process

2021-05-03 Thread Julian Hyde
ed > > > > tarball, seems like it should take ~5 minutes or less with the proper > > > > tooling? Verification is more heavy weight but again with the proper > > > > tooling and a good system for testing out more changes, it does not > > seem > > &

Re: [DISCUSS] Host DataFusion website on GitHub pages

2021-05-03 Thread Julian Hyde
M, Wes McKinney wrote: > > What would be the advantages of this versus publishing a website to > arrow.apache.org/datafusion? If the project is actually part of Apache > Arrow, I would be worried about having different base URLs altogether > for different subprojects > > On

Re: [DISCUSS] Host DataFusion website on GitHub pages

2021-05-03 Thread Julian Hyde
Would this web site be served from an apache.org domain? > On May 3, 2021, at 7:34 AM, Andy Grove wrote: > > Based on a quick reading of ASF documentation, I don't think we need to > vote on creating a website, but I do think that the user guide should be > published from

Re: [RUST] Proposal for more frequent Rust Arrow release process

2021-05-01 Thread Julian Hyde
st for at least 72 hours this does seem like a lot of > overhead every two weeks, but it seems that is something for Rust > maintainers to decide and adjust. > > -Micah > > On Saturday, May 1, 2021, Julian Hyde wrote: > > > (Removing user@ from cc. I think this is mai

Re: [RUST] Proposal for more frequent Rust Arrow release process

2021-05-01 Thread Julian Hyde
(Removing user@ from cc. I think this is mainly a dev@ issue.) I believe there are some tensions between this process and the Apache process. In particular, Apache releases tend to be a signed source distribution (tarball) that at least three PMC members download and verify. I totally understand

Re: [Gandiva] Replacing the LRU cache in gandiva

2021-04-21 Thread Julian Hyde
e: > > Julian, How do you plan to use Gandiva in Apache Calcite? > > On Tue, Apr 20, 2021 at 9:57 PM Julian Hyde wrote: > >> We would love to use Gandiva in Apache Calcite [1] but we are blocked >> because the JAR on Maven Central doesn't work on macOS, Linux or >>

Re: [Gandiva] Replacing the LRU cache in gandiva

2021-04-20 Thread Julian Hyde
We would love to use Gandiva in Apache Calcite [1] but we are blocked because the JAR on Maven Central doesn't work on macOS, Linux or Windows [2] and there seems to be no interest in fixing the problem. So I doubt whether anyone is using Gandiva in production (unless they have built the

Re: Rust sync meeting

2021-04-08 Thread Julian Hyde
; > > > to releasing. > > > > > > > > > > The discussions that we have had recently have centered around > > > > > communication questions. > > > > > > > > > > * Mailing list discussions serve to raise awareness around matte

Re: Rust sync meeting

2021-04-08 Thread Julian Hyde
Antoine, I need to correct your assertion > we develop on the side every day when we submit PRs from forks; > it's just a matter of how much complexity is being submitted at once Intuitively, there seems to be a continuum between a PR developed within a project to a major feature/codebase

Re: [Rust] Contributing to Apache Arrow

2021-03-07 Thread Julian Hyde
We have the exact same problem in Apache Calcite. People get the impression that “contributor” is some kind of achievement within the Apache hierarchy - it’s not, it’s just a JIRA concept - and it creates friction for people who want to contribute. (After all, I think we want people to log a

Arrow papers

2021-02-07 Thread Julian Hyde
A couple of interesting Arrow-related papers have appeared at conferences recently: Integrating Lightweight Compression Capabilities into Apache Arrow [1] Magpie: Python at Speed and Scale using Cloud Backends [2] I’m sharing them so that people are aware of the evolving state-of-the-art.

Re: [DISCUSS] Rotating the PMC Chair

2020-09-29 Thread Julian Hyde
in > matters, but overall IMHO we've had a generally healthy dynamic in our > governance. > > On Tue, Sep 29, 2020 at 2:12 AM Julian Hyde wrote: >> >> There has been some discussion in the Arrow PMC about rotating the PMC >> Chair (also known as the project VP) every ye

Re: [Rust] Arrow SQL Adapters/Connectors

2020-09-29 Thread Julian Hyde
ODBC and JDBC do not specify a wire protocol. So, while the client APIs are definitely row-based, any particular driver could use a protocol that is based on Arrow data. There is immense investment in ODBC and JDBC drivers, and they handle complex cases such as connection pooling, statement
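
The point that a row-based client API can sit on top of a column-based transport can be sketched as follows. This is a toy illustration in Python, not ODBC, JDBC, or Arrow code; the function name rows and the dict-of-lists batch layout are assumptions made for the example.

```python
from typing import Any, Dict, Iterator, List, Tuple


def rows(batch: Dict[str, List[Any]]) -> Iterator[Tuple[Any, ...]]:
    """Expose a columnar batch (column name -> list of values) row by row.

    The data stays columnar underneath; only the client-facing
    iteration is row-oriented.
    """
    return zip(*batch.values())


batch = {"id": [1, 2], "name": ["a", "b"]}
for row in rows(batch):
    print(row)
```

A driver built this way could keep its existing row-oriented interface while receiving data in columnar batches.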

[DISCUSS] Rotating the PMC Chair

2020-09-29 Thread Julian Hyde
There has been some discussion in the Arrow PMC about rotating the PMC Chair (also known as the project VP) every year. I wanted to raise the topic here for discussion among Arrow committers and within the broader Arrow community. Quite a few Apache projects have adopted a policy where they

Re: Some interesting VLDB reading on vectorized query evaluation relevant to Gandiva, other items

2018-09-28 Thread Julian Hyde
An excellent paper, thanks for sharing. (It’s worth reading every single one of the references.) I wonder whether Timo Kersten is related to Martin. > On Sep 27, 2018, at 9:44 AM, Wes McKinney wrote: > > http://www.vldb.org/pvldb/vol11/p2209-kersten.pdf

Re: [DISCUSS] Standardize Java style

2018-08-28 Thread Julian Hyde
My two cents: it’s much, much more important to have a standard style (enforced automatically) than what that style is. People should come into this expecting to compromise their personal preferences. > On Aug 28, 2018, at 10:29 AM, Bryan Cutler wrote: > > Sounds good Li. I just wanted to

Re: New u...@arrow.apache.org mailing list

2018-08-23 Thread Julian Hyde
Thanks! I think you should also announce on Twitter. There may be many followers who feel that dev@ is too heavy for them but user@ would be just right. Julian > On Aug 23, 2018, at 9:13 AM, Wes McKinney wrote: > > hi all, > > We have just created a user-oriented mailing list. Please

Re: [DISCUSS] Moving forward on the Arrow-Parquet C++ monorepo project

2018-08-19 Thread Julian Hyde
The votes to grant commit access that you refer to are votes to appoint committers or PMC members. Those votes are conducted in private to prevent embarrassment in case the vote fails, or if the vote passes and the individual declines the offer. I don’t see any such potential embarrassment

Re: Progress on Arrow RPC a.k.a. Arrow Flight

2018-08-16 Thread Julian Hyde
If your use case is SQL RPC, then you are getting close to Avatica's territory. Avatica[1] is a protocol for implementing language-independent JDBC and ODBC stacks. Now, I agree that many ODBC implementations are inefficient. Some ODBC stacks make more round trips than necessary, and do more

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Julian Hyde
+1 On Thu, Aug 16, 2018 at 8:56 AM Wes McKinney wrote: > > Dear all, > > The developers of Gandiva, an LLVM-based vectorized expression > evaluation engine for Arrow columnar memory, are proposing to donate > the project to Apache Arrow at some point in the near future, as has > been discussed on

Re: Increasing transparency of corporate support for Apache Arrow development

2018-08-16 Thread Julian Hyde
This is a tough one. I think we need to strike a delicate balance: we should thank companies for being benefactors, but should not put up with bragging (or as Ted puts it, genital comparisons). In Calcite, we allow committers to show their company affiliations[1]. I was initially concerned, but

Re: [ANNOUNCE] Apache Arrow 0.10.0 released

2018-08-07 Thread Julian Hyde
ar.gz.sha256 >> Second issue: I'm not sure what the issue is there. FWIW, they are not >> there for the 0.9.0 release either, and I don't see them in the main >> parquet project either: >> http://apache.cs.utah.edu/parquet/apache-parquet-1.10.0/ >> >> On Tue, Aug

Re: [ANNOUNCE] Apache Arrow 0.10.0 released

2018-08-07 Thread Julian Hyde
Congratulations! One thing: on http://arrow.apache.org/install/ there were the checksums (.tar.gz.asc and .tar.gz.sha512), but I couldn’t find a link to the mirrors with the source tarball (i.e. https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.10.0/

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-31 Thread Julian Hyde
A controlled fork doesn’t sound like a terrible option. Copy the code from parquet into arrow, and for a limited period of time it would be the primary. When that period is over, the code in parquet becomes the primary. During the period during which arrow has the primary, the parquet release

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-30 Thread Julian Hyde
I'm not going to comment on the design of the parquet-cpp module and whether it is “closer” to parquet or arrow. But I do think Wes’s proposal is consistent with Apache policy. PMCs make releases and govern communities; they don’t exist to manage code bases, except as a means to the end of

Re: Arrow stickers

2018-07-10 Thread Julian Hyde
Thanks for driving this. Can you put the word “apache” in there (in smaller font if you like)? That way, if you have the logo on slide 1 of your presentation, you’ve already done your duty to mention the Apache brand. Julian > On Jul 9, 2018, at 19:07, Kelly Stirman wrote: > > Hi everyone!

Re: Housing longer-term Arrow development, design, and roadmap documents

2018-06-26 Thread Julian Hyde
ob curating JIRA, but it would > be helpful to have some kind of high level narrative about the > different areas of the project. > > On Tue, Jun 26, 2018 at 1:21 PM, Julian Hyde wrote: >> I have a bias against wikis of all kinds. If left to their own devices, they >> tend to become

Re: Housing longer-term Arrow development, design, and roadmap documents

2018-06-26 Thread Julian Hyde
I have a bias against wikis of all kinds. If left to their own devices, they tend to become an unstructured mess. Of course, the lack of structure is what makes them useful for what Wes is proposing: gathering knowledge and organizing it as it evolves. But someone will need to play the

Re: Gandiva Initiative

2018-06-22 Thread Julian Hyde
This is exciting. We have wanted to build an Arrow adapter in Calcite for some time and have a prototype (see https://issues.apache.org/jira/browse/CALCITE-2173 ) but I hope that we can use Gandiva. I know that Gandiva has Java bindings, but

Re: Arrow stickers

2018-06-13 Thread Julian Hyde
;> have an "official" logo either. Could the design also be applied >> for >>> this >>>>>> and added to the site, etc? IBM has a design team that might be >> able >>> to >>>>>> help out. Does it make sense to put this

Re: Arrow stickers

2018-06-02 Thread Julian Hyde
Stickers are a great idea. Ask on comdev. Here’s a recent thread I found on the topic. https://lists.apache.org/thread.html/6a73bcc86929d0bd2d4bffb8d9b30f9a5590e872cfc2a884a6de9c5d@%3Cdev.community.apache.org%3E I recall someone (maybe Sharan from comdev) saying that any project can get free

Re: What do people think about a one day get together?

2018-04-09 Thread Julian Hyde
+1 The Arrow community would benefit greatly from a conference/unconference. Remember not to schedule it too close to ApacheCon. Julian > On Apr 9, 2018, at 10:18 AM, Jacques Nadeau wrote: > > Hey all, given that several people are busy in June, let's way until the >

Re: Taking some time off Arrow maintenance / development in April

2018-03-28 Thread Julian Hyde
+1 Great move. Your presence looms large over this project, so stepping back will give others the chance to take the lead. I know (from other projects) how oppressive it is to be the main “go to” person in a project. It irks me, for instance, when people address comments in PRs directly to

Re: [VOTE] Accept donation of Arrow Go implementation

2018-03-07 Thread Julian Hyde
+1 > On Mar 6, 2018, at 3:49 PM, Kouhei Sutou wrote: > > +1 > > In > "Re: [VOTE] Accept donation of Arrow Go implementation" on Tue, 6 Mar 2018 > 15:46:31 -0500, > Li Jin

Re: [DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-11-08 Thread Julian Hyde
, where users > have a timedelta64[UNIT] type (which results from any arithmetic > between timestamp values) > > On Wed, Nov 8, 2017 at 5:38 PM, Julian Hyde <jh...@apache.org> wrote: >> I don't know many examples of interval being used in the real world. >> But here's

Re: [DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-11-08 Thread Julian Hyde
mounts of type bloat (which means nothing will fully implement the >> spec and be able to interoperate). >> >> On Sat, Nov 4, 2017 at 3:46 PM, Julian Hyde <jh...@apache.org> wrote: >> >>> As I understand it, the proposal is to have both an interval data type[

Re: JDBC Adapter for Apache-Arrow

2017-11-07 Thread Julian Hyde
hat Julian was suggesting > earlier. > > -Original Message- > From: Julian Hyde [mailto:jh...@apache.org] > Sent: Tuesday, October 31, 2017 4:28 PM > To: dev@arrow.apache.org > Subject: Re: JDBC Adapter for Apache-Arrow > > Yeah, I agree, it should be an interf

Re: JDBC Adapter for Apache-Arrow

2017-11-01 Thread Julian Hyde
http://lmgtfy.com/?q=unsubscribe+apache+arrow > On Oct 31, 2017, at 5:20 PM, 丁锦祥 <vence...@gmail.com> wrote: > > unsubscribe > > On Tue, Oct 31, 2017 at 4:28 PM, Julian Hyde <jh...@apache.org> wrot

Re: JDBC Adapter for Apache-Arrow

2017-10-31 Thread Julian Hyde
ase correct me if I am missing anything. > > -Atul > > -Original Message- > From: Julian Hyde [mailto:jhyde.apa...@gmail.com] > Sent: Monday, October 30, 2017 7:50 PM > To: dev@arrow.apache.org > Subject: Re: JDBC Adapter for Apache-Arrow > > How about writing an

Re: JDBC Adapter for Apache-Arrow

2017-10-30 Thread Julian Hyde
How about writing an Arrow adapter for Calcite? I think it amounts to the same thing - you would inherit Calcite’s SQL parser and Avatica JDBC stack. Would this database be ephemeral (i.e. would the data go away when you close the connection)? If not, how would you know where to load the data

Re: [DISCUSS] Updating Arrow's "elevator pitch" on web properties

2017-10-27 Thread Julian Hyde
s one of the keystones of the project > > 3. We are building computation and messaging libraries to be > companions to the columnar format and memory management > > 4. We support many languages (I added "currently" to imply that we are > not closed to new languages) > > - W

Re: [DISCUSS] Updating Arrow's "elevator pitch" on web properties

2017-10-22 Thread Julian Hyde
ted around the mantra of zero-copy. With new >>> architectures designed to leverage non-volatile memory on the horizon, >>> this grows more important with each passing day. >>> >>> - Wes >>> >>> On Sun, Oct 22, 2017 at 7:32 AM, Uwe L. Korn <uw...@xhoc

Re: [DISCUSS] Updating Arrow's "elevator pitch" on web properties

2017-10-21 Thread Julian Hyde
Your proposed version is definitely an improvement. > "Apache Arrow is a cross-language development platform for in-memory > structured data access and analytics. It specifies a standardized > language-independent columnar memory format for flat and hierarchical > data, with support for zero-copy

Fwd: [DISCUSS] Storage-class memory ecosystem program

2017-10-19 Thread Julian Hyde
This thread on general@incubator may be of interest to Arrow. Julian > Begin forwarded message: > > From: "Gang(Gary) Wang" > Subject: [DISCUSS] Storage-class memory ecosystem program > Date: October 19, 2017 at 11:55:46 AM PDT > To: gene...@incubator.apache.org > Cc:

Re: [ANNOUNCE] New Arrow committers: Phillip Cloud and Bryan Cutler

2017-10-03 Thread Julian Hyde
Congratulations and welcome, Phillip and Bryan! > On Oct 3, 2017, at 5:27 AM, Wes McKinney wrote: > > On behalf of the Arrow PMC, I'm pleased to announce that Phillip Cloud > and Bryan Cutler have been invited to be Arrow committers. > > We are grateful for your

Re: [DISCUSS] Publishing Arrow development artifacts more frequently for alpha stage components

2017-09-08 Thread Julian Hyde
o approve (rather than the 3 vote requirement)? > > On Fri, Sep 8, 2017 at 4:58 PM, Julian Hyde <jh...@apache.org> wrote: >> >>> On Sep 7, 2017, at 7:06 PM, Wes McKinney <wesmck...@gmail.com> wrote: >>> >>> I personally don't have a problem with subc

Re: [DISCUSS] Publishing Arrow development artifacts more frequently for alpha stage components

2017-09-08 Thread Julian Hyde
> On Sep 7, 2017, at 7:06 PM, Wes McKinney wrote: > > I personally don't have a problem with subcomponents publishing > artifacts to package managers outside of the primary Apache project > votes and releases, so long as they clearly signal that these package > builds are

Re: Apache Arrow at JupyterCon

2017-08-30 Thread Julian Hyde
Thanks for sharing. Can we tweet those videos as well? I see that https://twitter.com/apachearrow only tweeted your slides. > On Aug 26, 2017, at 1:11 PM, Wes McKinney wrote: > > hi all, > > In case folks here are interested, I gave a

Arrow Plasma Object Store - IP clearance

2017-08-07 Thread Julian Hyde
The vote for IP clearance of the Plasma Object Store on the Incubator list has passed[1]. We can now proceed with a release. Julian [1] https://s.apache.org/arrow-plasma-object-store-clearance-result

Fwd: [IP CLEARANCE] Arrow Plasma Object Store

2017-08-02 Thread Julian Hyde
FYI: I started a vote on the Plasma IP. > Begin forwarded message: > > From: Julian Hyde <jh...@apache.org> > Subject: [IP CLEARANCE] Arrow Plasma Object Store > Date: August 2, 2017 at 12:12:27 PM PDT > To: gene...@incubator.apache.org > > Apache Arrow is receivin

Avro Arrow

2017-07-30 Thread Julian Hyde
This news item made me chuckle. What sounds like an interesting mash-up of two Apache projects ended up in a Canadian lake in the 1950s. http://www.torontosun.com/2017/07/28/search-for-long-missing-avro-arrow-models-gets-underway

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-27 Thread Julian Hyde
Does that mean we need to bump the whole project to 2.x? > As more languages come into the fold, this could happen more and more > often. How would people interpret a fast escalating major version > number? > > I am curious how Avro or Thrift have addressed this issue. > >

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
format >> >> Having this kind of stability is really important so that if any >> systems know how to parse or emit Arrow 1.x data, but aren't >> necessarily using the libraries provided by the project, they can have >> some assurance that we aren't going to break the Flatbuff

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
he wire. If that makes > sense. > > - Wes > > On Wed, Jul 26, 2017 at 2:35 PM, Julian Hyde <jh...@apache.org> wrote: >> 1.0 is a Big Deal because, under semantic versioning, there is a commitment >> to not change public APIs. If it weren’t for that, 1.0 would have

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Julian Hyde
d values are all contiguous. > > I agree that sorting is very useful; the metadata for Map should have > a field indicating whether or not the keys are sorted within each map > value > > - Wes > > On Wed, Jul 19, 2017 at 1:37 PM, Julian Hyde <jh...@apache.org> wrote: >>

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Julian Hyde
List> isn’t the only physical representation that makes sense. Because it doesn’t take advantage of the fact that (a) keys can be re-ordered, (b) keys are unique. So, another viable physical representation would be Struct, with the keys sorted. If keys are constant

Re: Branching for Arrow releases

2017-05-05 Thread Julian Hyde
ge when we come to > it. > > Thanks > Wes > > On Fri, May 5, 2017 at 12:51 PM, Julian Hyde <jh...@apache.org> wrote: > >> I’m fine with either proposal (holding off commits during the release >> vote, or rebasing master afterwards). >> >> I agr

Re: Branching for Arrow releases

2017-05-05 Thread Julian Hyde
I’m fine with either proposal (holding off commits during the release vote, or rebasing master afterwards). I agree with Julien that it’s really nice to have a simple, linear history (with releases on the master branch) and since Arrow is a fairly low-volume project we’re lucky we can do that.

[jira] [Commented] (ARROW-690) Only send JIRA updates to iss...@arrow.apache.org

2017-03-22 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936642#comment-15936642 ] Julian Hyde commented on ARROW-690: --- Presumably the creation email would continue to be sent to dev

Re: Moving JIRA updates from dev@ to issues@ only?

2017-03-22 Thread Julian Hyde
And will the initial email, indicating the creation of a JIRA case, continue to be sent to dev@? If so, +1. On Wed, Mar 22, 2017 at 8:55 AM, Wes McKinney wrote: > I created: > > https://issues.apache.org/jira/browse/ARROW-690 > > Since issue traffic is picking up, it may be

Re: Removed from list

2017-03-20 Thread Julian Hyde
Filipe, Did you try the unsubscribe instructions at https://mail-archives.apache.org/mod_mbox/arrow-dev/ ? > On Mar 20, 2017, at 12:51 PM, Filipe Ferreira > wrote: > > Is there a way that my email

[jira] [Commented] (ARROW-637) [Format] Add time zone metadata to Timestamp type

2017-03-20 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933463#comment-15933463 ] Julian Hyde commented on ARROW-637: --- I agree with [~wesmckinn]; Arrow is in good shape. Thank you

[jira] [Commented] (ARROW-637) [Format] Add time zone metadata to Timestamp type

2017-03-17 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930725#comment-15930725 ] Julian Hyde commented on ARROW-637: --- [~grahn], I know that PostgreSQL's "timestamp with time

Re: Making some decisions about date and time types

2017-03-17 Thread Julian Hyde
Am I correct that timestamp is a 64 bit signed integer representing microseconds since 1970? If so, it would be helpful to state the minimum and maximum values in the spec. I can’t quite imagine a use case for microsecond time, given that it takes the same number of bits as a timestamp. But
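For reference, the minimum and maximum values asked about above can be worked out directly. This is only a rough sketch: it assumes the timestamp really is a signed 64-bit count of microseconds since 1970-01-01, which the message itself poses as a question, and it uses the mean Gregorian year length for the year estimate. The names below are illustrative and are not taken from the Arrow spec.

```python
# Assumption: timestamp = signed 64-bit integer, microseconds since 1970-01-01.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

MICROS_PER_SECOND = 1_000_000
SECONDS_PER_DAY = 86_400
# Mean Gregorian year in days; an approximation, used only for a rough estimate.
DAYS_PER_YEAR = 365.2425


def approx_years(micros: int) -> float:
    """Convert a microsecond count to an approximate number of years."""
    return micros / MICROS_PER_SECOND / SECONDS_PER_DAY / DAYS_PER_YEAR


span_years = approx_years(INT64_MAX)

print("min value:", INT64_MIN)
print("max value:", INT64_MAX)
print(f"range: roughly +/- {span_years:,.0f} years around 1970")
```

Python's own `datetime` type stops at year 9999, so the sketch sticks to plain integer and float arithmetic instead of building date objects for the endpoints.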

[jira] [Commented] (ARROW-637) [Format] Add time zone metadata to Timestamp type

2017-03-16 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928920#comment-15928920 ] Julian Hyde commented on ARROW-637: --- One last quibble. You wrote {quote}Null or length-0 string would

[jira] [Commented] (ARROW-637) [Format] Add time zone metadata to Timestamp type

2017-03-16 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928813#comment-15928813 ] Julian Hyde commented on ARROW-637: --- Looks good. Except that you should support offsets as well

[jira] [Commented] (ARROW-637) [Format] Add time zone metadata to Timestamp type

2017-03-16 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928783#comment-15928783 ] Julian Hyde commented on ARROW-637: --- When you say a "timezone in the Timestamp flatbuffers type"
