[ 
https://issues.apache.org/jira/browse/DRILL-7449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014414#comment-17014414
 ] 

Igor Guzenko commented on DRILL-7449:
-------------------------------------

Hello [~benj641], 

Unfortunately, I wasn't able to reproduce the issue from the description. I've 
generated 500k, 1 and 10 million JSON rows with unique URLs and used Drill 1.16 
and latest master build in embedded mode. Both versions of Drill did well for 
the query from the description. Could you please share more details about your 
query (query profile) and cluster topology?

Thank you in advance, 
Igor


> memory leak parse_url function
> ------------------------------
>
>                 Key: DRILL-7449
>                 URL: https://issues.apache.org/jira/browse/DRILL-7449
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.16.0
>            Reporter: benj
>            Assignee: Igor Guzenko
>            Priority: Major
>
> Requests with *parse_url* works well when the number of treated rows is low 
> but produce memory leak when number of rows grows (~ between 500 000 and 1 
> million) (and for certain number of row sometimes the request works and 
> sometimes it failed with memory leaks)
> Extract from dataset tested:
> {noformat}
> {"Attributable":true,"Description":"Website has been identified as malicious 
> by 
> Bing","FirstReportedDateTime":"2018-03-12T18:49:38Z","IndicatorExpirationDateTime":"2018-04-11T23:33:13Z","IndicatorProvider":"Bing","IndicatorThreatType":"MaliciousUrl","IsPartnerShareable":true,"IsProductLicensed":true,"LastReportedDateTime":"2018-03-12T18:49:38Z","NetworkDestinationAsn":15169,"NetworkDestinationIPv4":"172.217.8.193","NetworkDestinationPort":80,"Tags":["us"],"ThreatDetectionProduct":"ES","TLPLevel":"Amber","Url":"http://pasuruanbloggers.blogspot.ru/2012/12/beginilah-cara-orang-jepang-berpacaran.html","Version":1.5}
> {"Attributable":true,"Description":"Website has been identified as malicious 
> by 
> Bing","FirstReportedDateTime":"2018-03-12T18:14:51Z","IndicatorExpirationDateTime":"2018-04-11T23:33:13Z","IndicatorProvider":"Bing","IndicatorThreatType":"MaliciousUrl","IsPartnerShareable":true,"IsProductLicensed":true,"LastReportedDateTime":"2018-03-12T18:14:51Z","NetworkDestinationAsn":15169,"NetworkDestinationIPv4":"216.58.192.193","NetworkDestinationPort":80,"Tags":["us"],"ThreatDetectionProduct":"ES","TLPLevel":"Amber","Url":"http://pasuruanbloggers.blogspot.ru/2012/12/cara-membuat-widget-slideshow-postingan.html","Version":1.5}
> {noformat}
> Request tested:
> {code:sql}
> ALTER SESSION SET `store.format`='parquet';
> ALTER SESSION SET `store.parquet.use_new_reader` = true;
> ALTER SESSION SET `store.parquet.compression` = 'snappy';
> ALTER SESSION SET `drill.exec.functions.cast_empty_string_to_null`= true;
> ALTER SESSION SET `store.json.all_text_mode` = true;
> ALTER SESSION SET `exec.enable_union_type` = true;
> ALTER SESSION SET `store.json.all_text_mode` = true;
> CREATE TABLE dfs.test.`output_pqt` AS
> (
> SELECT R.parsed.host AS Domain
> FROM ( 
>   SELECT parse_url(T.Url) AS parsed
>   FROM dfs.test.`file.json` AS T
> ) AS R 
> ORDER BY Domain
> );
> {code}
>  
>  Result when memory leak:
> {noformat}
> Error: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. 
> Memory leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
> Fragment 3:0
> Please, refer to logs for more information.
> [Error Id: 3ffa5b43-0dde-4518-bb5a-ea3aab97f3d4 on servor01:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
>     org.apache.drill.exec.memory.BaseAllocator.close():520
>     org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose():552
>     org.apache.drill.exec.ops.FragmentContextImpl.close():546
>     
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():386
>     org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():214
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():329
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>     java.lang.Thread.run():748 (state=,code=0)
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked 
> by query. Memory leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
> Fragment 3:0
> Please, refer to logs for more information.
> [Error Id: 3ffa5b43-0dde-4518-bb5a-ea3aab97f3d4 on servor01:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
>     org.apache.drill.exec.memory.BaseAllocator.close():520
>     org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose():552
>     org.apache.drill.exec.ops.FragmentContextImpl.close():546
>     
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():386
>     org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():214
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():329
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>     java.lang.Thread.run():748
>       at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:538)
>       at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:610)
>       at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
>       at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
>       at 
> org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>       at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1107)
>       at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1118)
>       at 
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>       at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
>       at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>       at 
> org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
>       at sqlline.Commands.execute(Commands.java:971)
>       at sqlline.Commands.sql(Commands.java:906)
>       at sqlline.SqlLine.dispatch(SqlLine.java:740)
>       at sqlline.SqlLine.runCommands(SqlLine.java:1691)
>       at sqlline.Commands.run(Commands.java:1513)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>       at sqlline.SqlLine.dispatch(SqlLine.java:736)
>       at sqlline.SqlLine.initArgs(SqlLine.java:475)
>       at sqlline.SqlLine.begin(SqlLine.java:531)
>       at sqlline.SqlLine.start(SqlLine.java:270)
>       at sqlline.SqlLine.main(SqlLine.java:201)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
> ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
> Fragment 3:0
> Please, refer to logs for more information.
> [Error Id: 3ffa5b43-0dde-4518-bb5a-ea3aab97f3d4 on servor01:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
>     org.apache.drill.exec.memory.BaseAllocator.close():520
>     org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose():552
>     org.apache.drill.exec.ops.FragmentContextImpl.close():546
>     
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():386
>     org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():214
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():329
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>     java.lang.Thread.run():748
>       at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
>       at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
>       at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
>       at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
>       at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
>       at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>       at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>       at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>       at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
>       at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>       at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>       at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>       at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>       at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
>       at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>       at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>       at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
> Memory leaked: (256)
> Allocator(frag:3:0) 3000000/256/9337280/30000000000 (res/actual/peak/limit)
>       at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:520)
>       at 
> org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:552)
>       at 
> org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:546)
>       at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:386)
>       at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:214)
>       at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:329)
>       at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>       at .......(:0
> {noformat}
> Note that same request including some other field and excluding use of 
> _parse_url_ works perfectly well.
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to