[jira] [Created] (ARROW-6221) [Java] Improve the performance of RangeEqualVisitor for comparing variable-width vectors
Liya Fan created ARROW-6221: --- Summary: [Java] Improve the performance of RangeEqualVisitor for comparing variable-width vectors Key: ARROW-6221 URL: https://issues.apache.org/jira/browse/ARROW-6221 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Liya Fan Assignee: Liya Fan Two improvements: 1. Compare the whole range of the data buffer, instead of comparing individual elements. 2. If two elements are of different sizes, there is no need to compare them. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
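The two improvements can be sketched on a simplified variable-width layout (a flat data array plus an offsets array, where element i occupies data[offsets[i] .. offsets[i+1])). This is an illustrative sketch under those assumptions, not the actual RangeEqualVisitor code; all names here are hypothetical:

```java
import java.util.Arrays;

public class RangeEqualsSketch {

    /**
     * Compares elements [start, start + length) of two variable-width
     * "vectors", each given as a data array plus an offsets array.
     */
    public static boolean rangeEquals(byte[] leftData, int[] leftOffsets,
                                      byte[] rightData, int[] rightOffsets,
                                      int start, int length) {
        // Improvement 2: if any pair of elements differs in size, the
        // ranges cannot be equal, so we never look at their bytes.
        for (int i = start; i < start + length; i++) {
            int leftLen = leftOffsets[i + 1] - leftOffsets[i];
            int rightLen = rightOffsets[i + 1] - rightOffsets[i];
            if (leftLen != rightLen) {
                return false;
            }
        }
        // Improvement 1: all element sizes match, so the concatenated
        // bytes line up and the whole range can be compared in one bulk
        // call instead of element by element.
        return Arrays.equals(leftData, leftOffsets[start], leftOffsets[start + length],
                             rightData, rightOffsets[start], rightOffsets[start + length]);
    }
}
```

The bulk comparison is valid because equal per-element lengths imply the two byte ranges have identical internal layout, even when their absolute start offsets differ.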
Re: Running an Arrow hackathon
Hi David, This is cool, thank you for doing it. My thoughts: > - Is there a label we already use for easy-to-start-with issues? I see > variations on newbie/easy-fix/beginner on JIRA, is there a preference > for one? I think beginner would be my preferred label. Note that I don't think it has been consistently applied. > - There would (hopefully) be an influx of PRs. We wouldn't expect any > sort of timeliness on reviews, but it could exacerbate the > Travis/AppVeyor capacity problem - should I encourage people to set up > personal Travis instances? A personal Travis instance could be helpful, but I think if there isn't a rush for reviews, they could be staggered so as not to have too much of an impact on development. On Mon, Aug 12, 2019 at 6:55 AM David Li wrote: > Hi all, > > We're thinking of hosting an internal open-source hackathon in > September. I wanted to make Apache Arrow one of the projects we work > on, so I wanted to give maintainers here a heads up, and clarify a few > things. > > I would be around to help set up environments and make sure that PRs > follow the expected format. I could also do first-pass reviews. We > would focus on Python/Java/Rust as those have the most interest > (though maybe we could snag a few Gophers). > > At this point I'm not sure how many participants we'll have - most > likely no more than 10 or so. > > - Is there a label we already use for easy-to-start-with issues? I see > variations on newbie/easy-fix/beginner on JIRA, is there a preference > for one? > - There would (hopefully) be an influx of PRs. We wouldn't expect any > sort of timeliness on reviews, but it could exacerbate the > Travis/AppVeyor capacity problem - should I encourage people to set up > personal Travis instances? > > Thanks, > David >
[jira] [Created] (ARROW-6220) [Java] Add API to avro adapter to limit number of rows returned at a time.
Micah Kornfield created ARROW-6220: -- Summary: [Java] Add API to avro adapter to limit number of rows returned at a time. Key: ARROW-6220 URL: https://issues.apache.org/jira/browse/ARROW-6220 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Micah Kornfield We can either let clients iterate or, ideally, provide an iterator interface. This is important for large Avro data and was also discussed as something readers/adapters should have.
[jira] [Created] (ARROW-6219) [Java] Add API for JDBC adapter that can convert less than the full result set at a time.
Micah Kornfield created ARROW-6219: -- Summary: [Java] Add API for JDBC adapter that can convert less than the full result set at a time. Key: ARROW-6219 URL: https://issues.apache.org/jira/browse/ARROW-6219 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Micah Kornfield We should make the number of rows per batch configurable and either let clients iterate or provide an iterator API. Otherwise, for large result sets we might run out of memory.
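A minimal sketch of the batched-conversion idea described here: an iterator that drains an underlying row source (a plain Iterator stands in for a JDBC ResultSet cursor) at most rowsPerBatch rows at a time, so memory use stays bounded regardless of result set size. The names are hypothetical, not the actual Arrow JDBC adapter API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchingIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> rows;   // stands in for a JDBC ResultSet cursor
    private final int rowsPerBatch;

    public BatchingIterator(Iterator<T> rows, int rowsPerBatch) {
        this.rows = rows;
        this.rowsPerBatch = rowsPerBatch;
    }

    @Override
    public boolean hasNext() {
        return rows.hasNext();
    }

    /** Returns the next batch of at most rowsPerBatch rows. */
    @Override
    public List<T> next() {
        List<T> batch = new ArrayList<>(rowsPerBatch);
        while (rows.hasNext() && batch.size() < rowsPerBatch) {
            batch.add(rows.next());
        }
        return batch;
    }
}
```

In a real adapter each batch would be materialized as a VectorSchemaRoot rather than a List, but the control flow is the same.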
[jira] [Created] (ARROW-6218) [Java] Add UINT type test in integration to avoid potential overflow
Ji Liu created ARROW-6218: - Summary: [Java] Add UINT type test in integration to avoid potential overflow Key: ARROW-6218 URL: https://issues.apache.org/jira/browse/ARROW-6218 Project: Apache Arrow Issue Type: Test Components: Java Reporter: Ji Liu Assignee: Ji Liu As per the discussion in [https://github.com/apache/arrow/pull/5002]: for UINT types, when writing/reading JSON data in the integration tests, the data type is widened (i.e. Long -> BigInteger, Int -> Long) to avoid potential overflow. For UINT8, the write-side and read-side code looks like this:

{code:java}
case UINT8: generator.writeNumber(UInt8Vector.getNoOverflow(buffer, index)); break;
{code}

{code:java}
BigInteger value = parser.getBigIntegerValue(); buf.writeLong(value.longValue());
{code}

We should add a test to guard against overflow in the data transfer process.
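The need for widening is easiest to see at the boundary values: the all-ones bit pattern is -1 in Java's signed types but the maximum value when read as unsigned. A small JDK-only illustration of the Int -> Long and Long -> BigInteger widening mentioned above (hypothetical helper names, mimicking the effect of Arrow's getNoOverflow, not its code):

```java
import java.math.BigInteger;

public class UnsignedWideningSketch {

    /** Widens a 64-bit unsigned value (stored in a signed long) to a
     *  BigInteger, mirroring the Long -> BigInteger widening above. */
    public static BigInteger readUint64(long raw) {
        // Long.toUnsignedString treats the bit pattern as unsigned,
        // so 0xFFFF...FF becomes 2^64 - 1 rather than -1.
        return new BigInteger(Long.toUnsignedString(raw));
    }

    /** Widens a 32-bit unsigned value (stored in a signed int) to a
     *  long, mirroring the Int -> Long widening above. */
    public static long readUint32(int raw) {
        return Integer.toUnsignedLong(raw);
    }
}
```

Without the widening, writing such values through the signed type would round-trip the maximum unsigned value as a negative number, which is exactly the overflow the proposed test should catch.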
Re: [Discussion][Java] Redesign the dictionary encoder
Hi Liya Fan, Ji Liu has an open pull request [1] that refactors the existing implementation to address the re-use aspect. I think it can also be extended to fix the memory ownership problem you highlighted. More work would be needed to address the customizable hash. Could you two please work together to figure out how to reconcile the following differences: 1. The new implementations you referenced require access to an ArrowBufPointer, which precludes usage on complex types. The existing implementation works with complex types. 2. The existing implementation has a customized hash table that avoids the need for boxing/unboxing. If I remember correctly, this showed approximately a 3-5% performance improvement in encoding. In both cases, it would probably be nice to move to an off-heap solution. Also, for removing the old encoder implementation, could you provide more details? The current encoder is used in the Vector module in unit tests at least, and the new encoders are in the algorithm package. How do you plan on resolving the dependencies? [1] https://github.com/apache/arrow/pull/5055/files On Sun, Aug 11, 2019 at 1:18 AM Fan Liya wrote: > Dear all, > > Dictionary encoding is an important feature, so it should be implemented > with good performance. > The current Java dictionary encoder implementation is based on static > utility methods in org.apache.arrow.vector.dictionary.DictionaryEncoder, > which has heavy performance overhead, preventing it from being useful in > practice: > > 1. The hash table cannot be reused for encoding multiple vectors (other > data structures & results cannot be reused either). > 2. The output vector should not be created/managed by the encoder (just > like in the out-of-place sorter) > 3. Different scenarios require different algorithms to compute the hash > code to avoid conflicts in the hash table, but this is not supported. 
> > Although some problems can be overcome by refactoring the current > implementation, it is difficult to do so without significantly changing the > current API. > So we propose a new design [1][2] of the dictionary encoder, to make it more > performant in practice. > > We plan to implement the new dictionary encoders with stateful objects, so > many useful partial/intermediate results can be reused. The new encoders > support using different hash code algorithms in different scenarios to > achieve good performance. > > We plan to support the new encoders in the following steps: > > 1. implement the new dictionary encoders in the algorithm module [3][4] > 2. make the old dictionary encoder deprecated > 3. remove the old encoder implementations > > Please give your valuable comments. > > Best, > Liya Fan > > [1] https://issues.apache.org/jira/browse/ARROW-5917 > [2] https://issues.apache.org/jira/browse/ARROW-6184 > [3] https://github.com/apache/arrow/pull/4994 > [4] https://github.com/apache/arrow/pull/5058 >
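To make the "stateful objects" idea above concrete, here is a deliberately simplified sketch (not the code in the linked PRs): the dictionary hash table lives in the encoder instance, so it survives across multiple vectors instead of being rebuilt per call. A boxing-free table with a customizable hash function, as discussed in the thread, would replace the HashMap in a real implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class StatefulDictionaryEncoder {
    // Encoder state: reused across encode() calls on multiple vectors.
    private final Map<String, Integer> dictionary = new HashMap<>();

    /** Encodes one value, assigning the next free id on first sight. */
    public int encode(String value) {
        return dictionary.computeIfAbsent(value, v -> dictionary.size());
    }

    /** Encodes a whole "vector" (a plain array stands in here). */
    public int[] encodeAll(String[] values) {
        int[] encoded = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            encoded[i] = encode(values[i]);
        }
        return encoded;
    }

    public int dictionarySize() {
        return dictionary.size();
    }
}
```

Note that the caller, not the encoder, would own and allocate the output vector, matching point 2 of the proposal.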
[jira] [Created] (ARROW-6217) [Website] Remove needless _site/ directory
Sutou Kouhei created ARROW-6217: --- Summary: [Website] Remove needless _site/ directory Key: ARROW-6217 URL: https://issues.apache.org/jira/browse/ARROW-6217 Project: Apache Arrow Issue Type: Task Components: Website Reporter: Sutou Kouhei Assignee: Sutou Kouhei
[jira] [Created] (ARROW-6216) Allow user to select the ZSTD compression level
Martin Radev created ARROW-6216: --- Summary: Allow user to select the ZSTD compression level Key: ARROW-6216 URL: https://issues.apache.org/jira/browse/ARROW-6216 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Martin Radev The compression level selected in Arrow for ZSTD is 1, which is the minimum compression level for the compressor. This leads to very high compression speed at the sacrifice of compression ratio. The user should be allowed to select the compression level, as both speed and ratio are data specific. The proposed solution is to expose the knob via an environment variable such as ARROW_ZSTD_COMPRESSION_LEVEL. Example: export ARROW_ZSTD_COMPRESSION_LEVEL=10 ./my_parquet_app
[jira] [Created] (ARROW-6215) [Java] RangeEqualVisitor does not properly compare ZeroVector
Bryan Cutler created ARROW-6215: --- Summary: [Java] RangeEqualVisitor does not properly compare ZeroVector Key: ARROW-6215 URL: https://issues.apache.org/jira/browse/ARROW-6215 Project: Apache Arrow Issue Type: Bug Components: Java Reporter: Bryan Cutler Assignee: Bryan Cutler ZeroVector.accept and RangeEqualVisitor always return true, no matter what type of vector it is compared against.
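A minimal sketch of the bug class being reported: an equality visitor has to check the concrete type of the other vector before declaring the ranges equal, otherwise a ZeroVector compares equal to anything. The classes below are stand-ins, not the actual Arrow Java types:

```java
public class ZeroVectorCompareSketch {
    interface Vector {}
    static final class ZeroVector implements Vector {}
    static final class IntVector implements Vector {}

    /** Buggy: returns true regardless of the other vector's type. */
    static boolean buggyEquals(ZeroVector left, Vector right) {
        return true;
    }

    /** Fixed: a ZeroVector can only equal another ZeroVector. */
    static boolean fixedEquals(ZeroVector left, Vector right) {
        return right instanceof ZeroVector;
    }
}
```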
[jira] [Created] (ARROW-6214) Sanitizer errors triggered via R bindings
Jeroen created ARROW-6214: - Summary: Sanitizer errors triggered via R bindings Key: ARROW-6214 URL: https://issues.apache.org/jira/browse/ARROW-6214 Project: Apache Arrow Issue Type: Bug Components: C++, R Affects Versions: 0.14.1 Environment: Linux Reporter: Jeroen When we run the examples of the R package through the sanitizers, several errors show up. These could be related to the segfaults we saw on the macOS builder on CRAN. Steps to reproduce + example outputs at: https://gist.github.com/jeroen/111901c351a4089a9effa90691a1dd81
Re: [DISCUSS] Developing a "data frame" subproject in the Arrow C++ libraries
hi Eric -- there have not been any patches yet related to it. I'm currently in the midst of some internal restructuring of the Parquet C++ library to address long-standing efficiency and memory use issues. It's my intention to spend time on the data frame project as one of my next focus areas, likely to be after Labor Day. - Wes On Mon, Aug 12, 2019 at 10:28 AM Eric Erhardt wrote: > > Hey Wes, > > I just wanted to check-in on this work. Have there been any updates to the > Arrow "data frame" project worth sharing? > > Thanks, > Eric > > -Original Message- > From: Wes McKinney > Sent: Tuesday, May 21, 2019 8:17 AM > To: dev@arrow.apache.org > Subject: Re: [DISCUSS] Developing a "data frame" subproject in the Arrow C++ > libraries > > On Tue, May 21, 2019, 8:43 AM Antoine Pitrou wrote: > > > > > Le 21/05/2019 à 13:42, Wes McKinney a écrit : > > > hi Antoine, > > > > > > On Tue, May 21, 2019 at 5:48 AM Antoine Pitrou > > wrote: > > >> > > >> > > >> Hi Wes, > > >> > > >> How does copy-on-write play together with memory-mapped data? It > > >> seems that, depending on whether the memory map has several > > >> concurrent users (a condition which may be timing-dependent), we > > >> will either persist changes on disk or make them ephemeral in > > >> memory. That doesn't sound very user-friendly, IMHO. > > > > > > With memory-mapping, any Buffer is sliced from the parent MemoryMap > > > [1] so mutating the data on disk using this interface wouldn't be > > > possible with the way that I've framed it. > > > > Hmm... I always forget that SliceBuffer returns a read-only view. > > > > The more important issue is that parent_ is non-null. The idea is that no > mutation is allowed if we reason that another Buffer object has access to the > address space of interest. I think this style of copy-on-write is a > reasonable compromise that prevents most kinds of defensive copying. > > > > Regards > > > > Antoine. > >
RE: [DISCUSS] Developing a "data frame" subproject in the Arrow C++ libraries
Hey Wes, I just wanted to check-in on this work. Have there been any updates to the Arrow "data frame" project worth sharing? Thanks, Eric -Original Message- From: Wes McKinney Sent: Tuesday, May 21, 2019 8:17 AM To: dev@arrow.apache.org Subject: Re: [DISCUSS] Developing a "data frame" subproject in the Arrow C++ libraries On Tue, May 21, 2019, 8:43 AM Antoine Pitrou wrote: > > Le 21/05/2019 à 13:42, Wes McKinney a écrit : > > hi Antoine, > > > > On Tue, May 21, 2019 at 5:48 AM Antoine Pitrou > wrote: > >> > >> > >> Hi Wes, > >> > >> How does copy-on-write play together with memory-mapped data? It > >> seems that, depending on whether the memory map has several > >> concurrent users (a condition which may be timing-dependent), we > >> will either persist changes on disk or make them ephemeral in > >> memory. That doesn't sound very user-friendly, IMHO. > > > > With memory-mapping, any Buffer is sliced from the parent MemoryMap > > [1] so mutating the data on disk using this interface wouldn't be > > possible with the way that I've framed it. > > Hmm... I always forget that SliceBuffer returns a read-only view. > The more important issue is that parent_ is non-null. The idea is that no mutation is allowed if we reason that another Buffer object has access to the address space of interest. I think this style of copy-on-write is a reasonable compromise that prevents most kinds of defensive copying. > Regards > > Antoine. >
[jira] [Created] (ARROW-6213) C++ tests fail for AVX512
Charles Coulombe created ARROW-6213: --- Summary: C++ tests fail for AVX512 Key: ARROW-6213 URL: https://issues.apache.org/jira/browse/ARROW-6213 Project: Apache Arrow Issue Type: Bug Components: C++ Affects Versions: 0.14.1 Environment: CentOS 7.6.1810, Intel Xeon Processor (Skylake, IBRS) avx512 Reporter: Charles Coulombe Attachments: arrow-0.14.1-c++-failed-tests-cmake-conf.txt, arrow-0.14.1-c++-failed-tests.txt When building the libraries for AVX512 with GCC 7.3.0, two C++ tests fail.

{code}
The following tests FAILED:
 28 - arrow-compute-compare-test (Failed)
 30 - arrow-compute-filter-test (Failed)
Errors while running CTest
{code}

while for AVX2 they pass.
Re: Proposal to move website source to arrow-site, add automatic builds
I started https://github.com/apache/arrow/pull/5015 for the removal last week; will finish that up today or tomorrow. Neal On Sun, Aug 11, 2019 at 8:23 AM Wes McKinney wrote: > > It looks like the git pruning is done. So we can remove the site/ > directory from the main repository at some point soon. > > On Thu, Aug 8, 2019 at 2:29 PM Neal Richardson > wrote: > > > > I need a committer to make a master branch on arrow-site so that I can > > PR to it. I thought it could be just an empty orphan branch but that > > proved not to work, so a committer will need to do the following: > > > > ``` > > git clone g...@github.com:$YOURGITHUB/arrow.git arrow-copy > > cd arrow-copy > > git filter-branch --prune-empty --subdirectory-filter site master > > vi .git/config > > # Change remote "origin"'s URL to be g...@github.com:arrow/arrow-site.git > > git push -f origin master > > ``` > > > > On Thu, Aug 8, 2019 at 12:07 PM Wes McKinney wrote: > > > > > > Yes, I think we have adequate lazy consensus. Can you spell out what > > > are the next steps? > > > > > > On Thu, Aug 8, 2019 at 2:01 PM Neal Richardson > > > wrote: > > > > > > > > Have we reached "lazy consensus" here? No further comments in the last > > > > three days. > > > > > > > > Thanks, > > > > Neal > > > > > > > > On Mon, Aug 5, 2019 at 1:46 PM Joris Van den Bossche > > > > wrote: > > > > > > > > > > This sounds as a good proposal to me (at least at the moment where we > > > > > have > > > > > separate docs and main site). > > > > > I agree that documentation should indeed stay with the code, as you > > > > > want to > > > > > update those together in PRs. But the website is something you can > > > > > typically update separately and also might want to update > > > > > independently > > > > > from code releases. And certainly if this proposal makes it easier to > > > > > work > > > > > on the site, all the better. > > > > > > > > > > Joris > > > > > > > > > > Op ma 5 aug. 
2019 20:30 schreef Wes McKinney : > > > > > > > > > > > Let's wait a little while to collect any additional opinions about > > > > > > this. > > > > > > > > > > > > There's pretty good evidence from other Apache projects that this > > > > > > isn't too bad of an idea > > > > > > > > > > > > Apache Calcite: https://github.com/apache/calcite-site > > > > > > Apache Kafka: https://github.com/apache/kafka-site > > > > > > Apache Spark: https://github.com/apache/spark-website > > > > > > > > > > > > The Apache projects I've seen where the same repository is used for > > > > > > $FOO.apache.org tend to be ones where the documentation _is_ the > > > > > > website. I think we would need to commission a significant web > > > > > > design > > > > > > overhaul to be able to make our documentation page adequate as the > > > > > > landing point for visitors to https://arrow.apache.org. > > > > > > > > > > > > On Sat, Aug 3, 2019 at 3:46 PM Neal Richardson > > > > > > wrote: > > > > > > > > > > > > > > Given the status quo, it would be difficult for this to make the > > > > > > > Arrow > > > > > > > website less maintained. In fact, arrow-site is currently missing > > > > > > > the > > > > > > > most recent two patches that modified the site directory in > > > > > > > apache/arrow. Having multiple manual deploy steps increases the > > > > > > > likelihood that the website stays stale. > > > > > > > > > > > > > > As someone who has been working on the arrow site lately, this > > > > > > > proposal makes it easier for me to make changes to the website > > > > > > > because > > > > > > > I can automatically deploy my changes to a test site, and that > > > > > > > lets > > > > > > > others in the community, who perhaps don't touch the website much, > > > > > > > verify that they're good. > > > > > > > > > > > > > > I agree that the documentation situation needs attention, but as I > > > > > > > said initially, that's orthogonal to this static site generation. 
> > > > > > > I'd > > > > > > > like to work on that next, and I think these changes will make it > > > > > > > easier to do. I would not propose moving doc generation out of > > > > > > > apache/arrow--that belongs with the code. > > > > > > > > > > > > > > Neal > > > > > > > > > > > > > > On Sat, Aug 3, 2019 at 9:49 AM Wes McKinney > > > > > > > wrote: > > > > > > > > > > > > > > > > I think that the project website and the project documentation > > > > > > > > are > > > > > > > > currently distinct entities. The current Jekyll website is > > > > > > > > independent > > > > > > > > from the Sphinx documentation project aside from a link to the > > > > > > > > documentation from the website. > > > > > > > > > > > > > > > > I am guessing that we would want to maintain some amount of > > > > > > > > separation > > > > > > > > between the main site at arrow.apache.org and the code / format > > > > > > > > documentation, at minimum because we may want to make > > > > > > > > documentation > > > > > > > > available for multiple
Running an Arrow hackathon
Hi all, We're thinking of hosting an internal open-source hackathon in September. I wanted to make Apache Arrow one of the projects we work on, so I wanted to give maintainers here a heads up, and clarify a few things. I would be around to help set up environments and make sure that PRs follow the expected format. I could also do first-pass reviews. We would focus on Python/Java/Rust as those have the most interest (though maybe we could snag a few Gophers). At this point I'm not sure how many participants we'll have - most likely no more than 10 or so. - Is there a label we already use for easy-to-start-with issues? I see variations on newbie/easy-fix/beginner on JIRA, is there a preference for one? - There would (hopefully) be an influx of PRs. We wouldn't expect any sort of timeliness on reviews, but it could exacerbate the Travis/AppVeyor capacity problem - should I encourage people to set up personal Travis instances? Thanks, David
Re: [Discuss][FlightRPC] Extensions to Flight: middleware and DoPut tickets
I've (finally) put up a draft implementation of middleware for Java: https://github.com/apache/arrow/pull/5068 Hopefully this helps clarify how the proposal works. Best, David On 7/25/19, David Li wrote: > Thanks for the feedback, Antoine. That would be a natural method to > have - then the server could deny uploads (as you mention) or note > that the stream already exists. I've updated the proposal to reflect > that, leaving more detailed semantics (e.g. append vs overwrite) > application-defined. > > Best, > David > > On 7/25/19, Antoine Pitrou wrote: >> >> Le 08/07/2019 à 16:33, David Li a écrit : >>> Hi all, >>> >>> I've put together two more proposals for Flight, motivated by projects >>> we've been working on. I'd appreciate any comments on the >>> design/reasoning; I'm already working on the implementation, alongside >>> some other improvements to Flight. >>> >>> The first is to modify the DoPut call to follow the same request >>> pattern as DoGet. This is a format change and would require a vote. >>> >>> https://docs.google.com/document/d/1hrwxNwPU1aOD_1ciRUOaGeUCyXYOmu6IxxCfY6Stj6w/edit?usp=sharing >> >> It seems it would be useful to introduce a GetPutInfo (or GetUploadInfo) >> so as to allow differential behaviour between getting and putting. >> >> (one trivial case would be to disallow uploading altogether :-))) >> >> Regards >> >> Antoine. >> >
[jira] [Created] (ARROW-6212) [Java] Support vector rank operation
Liya Fan created ARROW-6212: --- Summary: [Java] Support vector rank operation Key: ARROW-6212 URL: https://issues.apache.org/jira/browse/ARROW-6212 Project: Apache Arrow Issue Type: New Feature Components: Java Reporter: Liya Fan Assignee: Liya Fan Given an unsorted vector, we want to get the index of the ith smallest element in the vector. This function is supported by the rank operation. We provide an implementation that gets the index with the desired rank, without sorting the vector (the vector is left intact), and the implementation takes O(n) time, where n is the vector length.
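One way to get the described behavior (expected O(n) time, original vector left intact) is quickselect over a separate index array: only the indices are permuted, never the data. A hypothetical sketch with a plain int[] standing in for an Arrow vector; this is not necessarily the algorithm used in the actual patch:

```java
import java.util.Random;

public class RankSketch {

    /** Returns the index (in the untouched values array) of the
     *  element with the given 0-based rank (rank 0 = smallest). */
    public static int indexOfRank(int[] values, int rank) {
        int n = values.length;
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) idx[i] = i; // permute indices, not data
        int lo = 0, hi = n - 1;
        Random rnd = new Random(42); // fixed seed: deterministic pivots
        while (lo < hi) {
            // Lomuto partition around a random pivot
            int p = lo + rnd.nextInt(hi - lo + 1);
            int pivot = values[idx[p]];
            swap(idx, p, hi);
            int store = lo;
            for (int i = lo; i < hi; i++) {
                if (values[idx[i]] < pivot) {
                    swap(idx, i, store++);
                }
            }
            swap(idx, store, hi);
            if (store == rank) {
                return idx[store];
            } else if (store < rank) {
                lo = store + 1;
            } else {
                hi = store - 1;
            }
        }
        return idx[lo];
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Each partition discards a fraction of the candidates, giving the expected linear running time, and because only idx is shuffled the input vector is never modified.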
Re: [ANNOUNCE] New Arrow PMC member: Micah Kornfield
Congrats Micah, very well deserved !! On Mon, Aug 12, 2019 at 8:35 AM Micah Kornfield wrote: > Thanks everyone for the good wishes! > > On Fri, Aug 9, 2019 at 5:41 PM Fan Liya wrote: > > > Big congratulations! Micah > > Thank you so much for all the help! > > > > Best, > > Liya Fan > > > > On Saturday, August 10, 2019, Brian Hulette wrote: > > > Congratulations Micah! Well deserved :) > > > > > > On Fri, Aug 9, 2019 at 9:02 AM Francois Saint-Jacques < > > > fsaintjacq...@gmail.com> wrote: > > > > > >> Congrats! > > >> > > >> well deserved. > > >> > > >> On Fri, Aug 9, 2019 at 11:12 AM Wes McKinney > > wrote: > > >> > > > >> > The Project Management Committee (PMC) for Apache Arrow has invited > > >> > Micah Kornfield to become a PMC member and we are pleased to > announce > > >> > that Micah has accepted. > > >> > > > >> > Congratulations and welcome! > > >> > > > > > >
[jira] [Created] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface
Pindikura Ravindra created ARROW-6211: - Summary: [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface Key: ARROW-6211 URL: https://issues.apache.org/jira/browse/ARROW-6211 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra This is a follow-up from [https://github.com/apache/arrow/pull/4933] public interface VectorVisitor {..} In ValueVector: public OUT accept(VectorVisitor visitor, IN value) throws EX;
[jira] [Created] (ARROW-6210) [Java] remove equals API from ValueVector
Pindikura Ravindra created ARROW-6210: - Summary: [Java] remove equals API from ValueVector Key: ARROW-6210 URL: https://issues.apache.org/jira/browse/ARROW-6210 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra This is a follow-up from [https://github.com/apache/arrow/pull/4933] The callers should be fixed to use the RangeEquals API instead.
[jira] [Created] (ARROW-6209) [Java] Extract set null method to the base class for fixed width vectors
Liya Fan created ARROW-6209: --- Summary: [Java] Extract set null method to the base class for fixed width vectors Key: ARROW-6209 URL: https://issues.apache.org/jira/browse/ARROW-6209 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Liya Fan Assignee: Liya Fan Currently, each fixed width vector has the setNull method. All these implementations are identical, so we move them to the base class.
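The shared implementation amounts to clearing one bit in the validity bitmap (1 bit per element, 0 = null). A simplified sketch of the refactoring, with a plain byte[] standing in for an Arrow validity buffer and illustrative class names:

```java
public class SetNullSketch {

    /** Base class holding the single shared setNull implementation. */
    public abstract static class BaseFixedWidthVector {
        protected final byte[] validityBuffer;

        protected BaseFixedWidthVector(int valueCount) {
            this.validityBuffer = new byte[(valueCount + 7) / 8];
        }

        /** Shared by all fixed-width subclasses: clear the validity bit. */
        public void setNull(int index) {
            validityBuffer[index >> 3] &= ~(1 << (index & 7));
        }

        public void setValid(int index) {
            validityBuffer[index >> 3] |= (1 << (index & 7));
        }

        public boolean isNull(int index) {
            return (validityBuffer[index >> 3] & (1 << (index & 7))) == 0;
        }
    }

    /** Example subclass: no longer needs its own setNull. */
    public static class IntVector extends BaseFixedWidthVector {
        public IntVector(int valueCount) {
            super(valueCount);
        }
    }
}
```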
[jira] [Created] (ARROW-6208) [Java] Correct byte order before comparing in ByteFunctionHelpers
Prudhvi Porandla created ARROW-6208: --- Summary: [Java] Correct byte order before comparing in ByteFunctionHelpers Key: ARROW-6208 URL: https://issues.apache.org/jira/browse/ARROW-6208 Project: Apache Arrow Issue Type: Bug Components: Java Affects Versions: 1.0.0 Reporter: Prudhvi Porandla Assignee: Prudhvi Porandla Fix For: 1.0.0