[ 
https://issues.apache.org/jira/browse/ARROW-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905982#comment-16905982
 ] 

Jim Northrup commented on ARROW-6202:
-------------------------------------

For the record, we have pre-TensorFlow column counts of about 14,000 one-hot 
attributes, and we are seeing NumPy RAM requirements of 160 GB.
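
To put the 14,000-column figure in perspective, here is a rough sizing sketch for a dense one-hot matrix. Only the column count comes from the comment above. The row count and cell widths are assumptions chosen for illustration, and `OneHotFootprint` is a hypothetical helper, not part of Arrow or NumPy.

```java
/** Back-of-the-envelope footprint of a dense one-hot matrix. */
final class OneHotFootprint {

    /** Bytes for a dense rows x cols matrix with a fixed cell width. */
    static long denseBytes(long rows, long cols, long bytesPerCell) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(Math.multiplyExact(rows, cols), bytesPerCell);
    }

    /** Bytes if each cell is packed into one bit, each column rounded up to whole bytes. */
    static long bitPackedBytes(long rows, long cols) {
        long bytesPerColumn = (rows + 7) / 8;
        return Math.multiplyExact(bytesPerColumn, cols);
    }

    public static void main(String[] args) {
        long cols = 14_000L;     // from the comment
        long rows = 1_000_000L;  // hypothetical: the comment gives no row count

        System.out.println("8-byte cells: " + denseBytes(rows, cols, 8) + " bytes");
        System.out.println("1-byte cells: " + denseBytes(rows, cols, 1) + " bytes");
        System.out.println("1-bit cells:  " + bitPackedBytes(rows, cols) + " bytes");
    }
}
```

The point of the comparison is only that the cell width dominates the total: for the same rows and columns, 8-byte cells cost 64 times as much as bit-packed ones.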



> [Java] Exception in thread "main" 
> org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of 
> size 4 due to memory limit. Current allocation: 2147483646
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-6202
>                 URL: https://issues.apache.org/jira/browse/ARROW-6202
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 0.14.1
>            Reporter: Jim Northrup
>            Priority: Major
>              Labels: jdbc
>
> JDBC query results exceed the native heap even with generous -Xmx settings. 
> For a result set of roughly 800 MB of CSV/flat-file data, Arrow cannot hold 
> the contents in RAM long enough to persist them to disk unless the caller 
> has explicit knowledge beyond what the unit test sample code shows.
> Source:
> https://github.com/jnorthrup/jdbc2json/blob/master/src/main/java/com/fnreport/QueryToFeather.kt#L83
> {code:java}
> Exception in thread "main" org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of size 4 due to memory limit. Current allocation: 2147483646
>         at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:307)
>         at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:277)
>         at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.updateVector(JdbcToArrowUtils.java:610)
>         at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToFieldVector(JdbcToArrowUtils.java:462)
>         at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:396)
>         at org.apache.arrow.adapter.jdbc.JdbcToArrow.sqlToArrow(JdbcToArrow.java:225)
>         at org.apache.arrow.adapter.jdbc.JdbcToArrow.sqlToArrow(JdbcToArrow.java:187)
>         at org.apache.arrow.adapter.jdbc.JdbcToArrow.sqlToArrow(JdbcToArrow.java:156)
>         at com.fnreport.QueryToFeather$Companion.go(QueryToFeather.kt:83)
>         at com.fnreport.QueryToFeather$Companion$main$1.invokeSuspend(QueryToFeather.kt:95)
>         at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
>         at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:241)
>         at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:270)
>         at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:79)
>         at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:54)
>         at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
>         at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:36)
>         at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
>         at com.fnreport.QueryToFeather$Companion.main(QueryToFeather.kt:93)
>         at com.fnreport.QueryToFeather.main(QueryToFeather.kt)
> {code}
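
The numbers in the exception message are worth a second look: the current allocation of 2147483646 is one byte short of `Integer.MAX_VALUE`, so a 4-byte request cannot fit if the allocator is capped there. The sketch below only redoes that arithmetic. The cap of `Integer.MAX_VALUE` is an inference from the message, not something the trace states, and `AllocatorLimitCheck` is a hypothetical helper rather than Arrow code.

```java
/** Replays the arithmetic behind the "memory limit" message. */
final class AllocatorLimitCheck {

    /** True when granting the request would push the allocation past the limit. */
    static boolean exceedsLimit(long current, long requested, long limit) {
        return current + requested > limit;
    }

    public static void main(String[] args) {
        long current = 2_147_483_646L;          // "Current allocation" from the exception
        long requested = 4L;                    // "buffer of size 4" from the exception
        long assumedLimit = Integer.MAX_VALUE;  // assumption: limit inferred, not reported

        System.out.println("headroom under assumed limit: " + (assumedLimit - current));
        System.out.println("request exceeds assumed limit: "
                + exceedsLimit(current, requested, assumedLimit));
    }
}
```

If that inference holds, the failure would be governed by the allocator's own limit rather than by -Xmx, which would explain why raising the JVM heap did not help.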



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
