[
https://issues.apache.org/jira/browse/ARROW-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911955#comment-16911955
]
Jim Northrup commented on ARROW-6206:
-------------------------------------
(previously responded by email, sorry if this creates a dupe)
I admire Arrow for doing a thing well. I hope that if I simply call “mvn
maven-versions-plugin:latest” in the future this simple jdbc code will work
better than before.
I appreciate the attention to the details.
I think the gist of this discussion is that TensorFlow one-hot columns may
quickly test the expected norms of Arrow. Likewise, timeseries datasets have
us blowing gaskets all over the place in terms of time-to-completion and RAM
when using pandas. What do we do with a 300 GB numpy dataset living in swap
that takes 3 days to build? There are no LSTM examples that demonstrate
anything but toy datasets.
Turbodbc looks like a good fit for reducing transcription times.
For what I need in the space of Arrow, I think the ideal tool is something that
works in and out of numpy and delegates to and from Apache Geode or Hazelcast
as the main substrate.
If perchance Arrow can act as a window to memory grids, all the better.
As I find the time for signups and 2FA, I will write this up for the lists.
> [Java][Docs] Document environment variables/java properties
> -----------------------------------------------------------
>
> Key: ARROW-6206
> URL: https://issues.apache.org/jira/browse/ARROW-6206
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Documentation, Java
> Reporter: Micah Kornfield
> Assignee: Ji Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.15.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Specifically, "-Dio.netty.tryReflectionSetAccessible=true" for JVMs >= 9 and
> BoundsChecking/NullChecking for get.
>
>
--
This message was sent by Atlassian Jira
(v8.3.2#803003)