[jira] [Commented] (DRILL-8495) Tried to remove unmanaged buffer

ASF GitHub Bot (Jira) Wed, 15 May 2024 04:47:05 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846594#comment-17846594
 ]


ASF GitHub Bot commented on DRILL-8495:
---------------------------------------

rymarm opened a new pull request, #2913:
URL: https://github.com/apache/drill/pull/2913

   # [DRILL-8495](https://issues.apache.org/jira/browse/DRILL-8495): Tried to 
remove unmanaged buffer
   
   The root cause of the issue is that multiple HiveWriters share the same 
`DrillBuf`, and during execution any of them may reallocate the buffer if it is 
too small for a value (larger than 256 bytes). Since 
`drillBuf.reallocIfNeeded(int size)` returns a new `DrillBuf` instance, all 
other writers still hold a reference to the old buffer, which is no longer 
managed after the `drillBuf.reallocIfNeeded(int size)` call.
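   
   To illustrate the failure mode, here is a minimal, self-contained sketch. It 
does not use Drill's classes: `ToyBuf`, `ToyBufferManager`, `ToyWriter` and 
`SharedBufferDemo` are hypothetical stand-ins that only mimic the behaviour 
described above, where reallocation hands back a new buffer and the manager 
rejects a buffer it no longer tracks.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for a managed buffer; this is NOT Drill's DrillBuf. */
final class ToyBuf {
    final int capacity;
    private final ToyBufferManager manager;

    ToyBuf(int capacity, ToyBufferManager manager) {
        this.capacity = capacity;
        this.manager = manager;
    }

    /** Returns this buffer if it is large enough, otherwise a NEW managed buffer. */
    ToyBuf reallocIfNeeded(int size) {
        return capacity >= size ? this : manager.replace(this, size);
    }
}

/** Toy stand-in for a buffer manager that tracks the buffers it owns. */
final class ToyBufferManager {
    private final Set<ToyBuf> managed = new HashSet<>();

    ToyBuf allocate(int size) {
        ToyBuf buf = new ToyBuf(size, this);
        managed.add(buf);
        return buf;
    }

    /** Swaps a tracked buffer for a larger one; fails if the old one is not tracked. */
    ToyBuf replace(ToyBuf old, int newSize) {
        if (!managed.remove(old)) {
            throw new IllegalStateException("Tried to remove unmanaged buffer.");
        }
        return allocate(newSize);
    }
}

/** Toy writer that keeps its own reference to a buffer (assumed behaviour). */
final class ToyWriter {
    private ToyBuf buf;

    ToyWriter(ToyBuf buf) {
        this.buf = buf;
    }

    void write(byte[] value) {
        // Only this writer learns about the new buffer returned here.
        buf = buf.reallocIfNeeded(value.length);
    }
}

final class SharedBufferDemo {
    /** Returns the exception message hit by the second writer, or null if none. */
    static String run() {
        ToyBufferManager manager = new ToyBufferManager();
        ToyBuf shared = manager.allocate(256);
        ToyWriter name = new ToyWriter(shared);
        ToyWriter surname = new ToyWriter(shared);

        name.write(new byte[300]); // reallocates; 'shared' is no longer managed
        try {
            surname.write(new byte[300]); // still holds the old, unmanaged buffer
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

   In this toy model the second writer fails because it never sees the buffer 
returned to the first writer, which is the same pattern as two string columns 
in one row both exceeding the initial buffer size.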
   
   ## Description
   
   `HiveValueWriterFactory` now creates a unique `DrillBuf` for each writer. 
   
   HiveWriters are actually used one at a time, so a single buffer could serve 
all the writers. To do this, I could create a holder class for `DrillBuf`, so 
that each writer references the same holder, which would store the new buffer 
returned by every `drillBuf.reallocIfNeeded(int size)` call. But I thought such 
logic looked slightly confusing, so I decided to let each HiveWriter use its 
own buffer.
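   
   For reference, the holder alternative that was considered and rejected could 
look roughly like the sketch below. This is not code from the PR: `BufHolder`, 
`HolderWriter` and `HolderDemo` are hypothetical names, and a plain `byte[]` 
stands in for `DrillBuf` so the example stays self-contained.

```java
/** Hypothetical holder: the single place where the current buffer reference lives. */
final class BufHolder {
    private byte[] buf; // stand-in for DrillBuf (assumption for illustration)

    BufHolder(int initialCapacity) {
        buf = new byte[initialCapacity];
    }

    /** Replaces the held buffer with a larger one if needed and returns the current one. */
    byte[] ensureCapacity(int size) {
        if (buf.length < size) {
            buf = new byte[size];
        }
        return buf;
    }

    int capacity() {
        return buf.length;
    }
}

/** Hypothetical writer that never caches the buffer, only the holder. */
final class HolderWriter {
    private final BufHolder holder;

    HolderWriter(BufHolder holder) {
        this.holder = holder;
    }

    /** Copies the value into the shared buffer and returns the number of bytes written. */
    int write(byte[] value) {
        byte[] target = holder.ensureCapacity(value.length);
        System.arraycopy(value, 0, target, 0, value.length);
        return value.length;
    }
}

final class HolderDemo {
    /** Two writers share one holder; returns the final buffer capacity. */
    static int run() {
        BufHolder holder = new BufHolder(256);
        HolderWriter name = new HolderWriter(holder);
        HolderWriter surname = new HolderWriter(holder);

        name.write(new byte[300]);    // holder now points at a 300-byte buffer
        surname.write(new byte[400]); // sees the current buffer, grows it again
        return holder.capacity();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

   Because every writer goes through the holder, a reallocation by one writer is 
visible to the others. The cost is the extra indirection mentioned above, which 
is why giving each writer its own buffer was preferred.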
   
   ## Documentation
   \-
   
   ## Testing
   Added a new unit test that queries a Hive table with variable-length values 
of Binary, VarChar, Char and String types.
   




> Tried to remove unmanaged buffer
> --------------------------------
>
>                 Key: DRILL-8495
>                 URL: https://issues.apache.org/jira/browse/DRILL-8495
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.21.1
>            Reporter: Maksym Rymar
>            Assignee: Maksym Rymar
>            Priority: Major
>
>  
> Drill throws an exception on Hive table:
> {code:java}
>   (java.lang.IllegalStateException) Tried to remove unmanaged buffer.
>     org.apache.drill.exec.ops.BufferManagerImpl.replace():51
>     io.netty.buffer.DrillBuf.reallocIfNeeded():101
>     
> org.apache.drill.exec.store.hive.writers.primitive.HiveStringWriter.write():38
>     
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.readHiveRecordAndInsertIntoRecordBatch():416
>     
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.next():402
>     org.apache.drill.exec.physical.impl.ScanBatch.internalNext():235
>     org.apache.drill.exec.physical.impl.ScanBatch.next():299
>     
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
>     org.apache.drill.exec.record.AbstractRecordBatch.next():109
>     org.apache.drill.exec.record.AbstractRecordBatch.next():101
>     org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
>     
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():93
>     org.apache.drill.exec.record.AbstractRecordBatch.next():160
>     
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
>     org.apache.drill.exec.physical.impl.BaseRootExec.next():103
>     
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
>     org.apache.drill.exec.physical.impl.BaseRootExec.next():93
>     org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0():321
>     java.security.AccessController.doPrivileged():-2
>     javax.security.auth.Subject.doAs():422
>     org.apache.hadoop.security.UserGroupInformation.doAs():1899
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>     java.lang.Thread.run():748 {code}
>  
>  
> Reproduce:
>  # Create Hive table:
> {code:java}
> create table if NOT EXISTS students(id int, name string, surname string) 
> stored as parquet;{code}
>  # Insert a new row with 2 string values of size > 256 bytes:
> {code:java}
> insert into students values (1, 
> 'Veeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeery
>  long name', 
> 'biiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiig
>  surname');{code}
>  # Execute Drill query:
> {code:java}
> select * from hive.`students` {code}
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
