[ 
https://issues.apache.org/jira/browse/DRILL-7164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820396#comment-16820396
 ] 

ASF GitHub Bot commented on DRILL-7164:
---------------------------------------

sohami commented on pull request #1751: DRILL-7164: KafkaFilterPushdownTest is 
sometimes failing to pattern match correctly
URL: https://github.com/apache/drill/pull/1751#discussion_r276342157
 
 

 ##########
 File path: 
contrib/storage-kafka/src/test/java/org/apache/drill/exec/store/kafka/KafkaFilterPushdownTest.java
 ##########
 @@ -32,10 +33,8 @@
 @Category({KafkaStorageTest.class, SlowTest.class})
 public class KafkaFilterPushdownTest extends KafkaTestBase {
   private static final int NUM_PARTITIONS = 5;
-  private static final String expectedSubStr = "    \"kafkaScanSpec\" : {\n" +
-                                                   "      \"topicName\" : \"drill-pushdown-topic\"\n" +
-                                                   "    },\n" +
-                                                   "    \"cost\"";
+  private static final String expectedPattern = "KafkaGroupScan.*KafkaScanSpec=KafkaScanSpec.*" +
+                                                   "topicName=drill-pushdown-topic.*rowcount.*=.*%s";
 
 Review comment:
   This pattern is not correct, specifically the `rowcount.*=.*%s` part, where 
%s is replaced by the expected rowcount value. Because `.*` precedes %s, the 
regex can skip over any text after `=` and still match as long as the expected 
digits appear somewhere later in the line. For example, with the string below, 
if the expected rowcount is 0, the pattern still matches, since `0.0` precedes 
the `memory` string: `.*0` matches `1.0, cumulative cost = {1.0 rows, 2.0 cpu, 
0.0 io, 0.0 network, 0.0`, so the match succeeds when it should have failed, 
because the actual rowcount is 1.0.
   
   `Scan(table=[[kafka, drill-pushdown-topic]], groupscan=[KafkaGroupScan 
[KafkaScanSpec=KafkaScanSpec [topicName=drill-pushdown-topic], columns=[`**`, 
`kafkaMsgTimestamp`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY 
kafkaMsgTimestamp): rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 
io, 0.0 network, 0.0 memory}, id = 966`
   
   Can we just replace it with `rowcount = %s` instead?
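
   The difference can be reproduced in isolation. The following is a minimal sketch, not the test's actual code: the class name `RowcountPatternDemo` and the helper `planContains` are hypothetical, and it assumes the plan is checked with `Matcher.find()` after `String.format` substitutes the expected rowcount into the pattern. The plan line is the one quoted above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the Drill code base.
class RowcountPatternDemo {
  // Plan line quoted in the review comment, joined into a single string.
  static final String PLAN_LINE =
      "Scan(table=[[kafka, drill-pushdown-topic]], groupscan=[KafkaGroupScan "
      + "[KafkaScanSpec=KafkaScanSpec [topicName=drill-pushdown-topic], columns=[`**`, "
      + "`kafkaMsgTimestamp`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY "
      + "kafkaMsgTimestamp): rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 "
      + "io, 0.0 network, 0.0 memory}, id = 966";

  // Pattern from the diff under review: ".*" sits between "=" and the value.
  static final String LOOSE =
      "KafkaGroupScan.*KafkaScanSpec=KafkaScanSpec.*"
      + "topicName=drill-pushdown-topic.*rowcount.*=.*%s";

  // Suggested variant: the value must directly follow "rowcount = ".
  static final String STRICT =
      "KafkaGroupScan.*KafkaScanSpec=KafkaScanSpec.*"
      + "topicName=drill-pushdown-topic.*rowcount = %s";

  // Assumption: the expected rowcount is substituted with String.format and
  // the plan is searched with find(), i.e. a substring match.
  static boolean planContains(String template, String expectedRowCount) {
    String regex = String.format(template, expectedRowCount);
    return Pattern.compile(regex).matcher(PLAN_LINE).find();
  }

  public static void main(String[] args) {
    // Actual rowcount in PLAN_LINE is 1.0.
    System.out.println(planContains(LOOSE, "0"));    // true  (false positive)
    System.out.println(planContains(STRICT, "0"));   // false
    System.out.println(planContains(STRICT, "1.0")); // true
  }
}
```

   With the loose pattern, an expected rowcount of 0 is found in the cost section of the line, while the stricter pattern only accepts a value that immediately follows `rowcount = `.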
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> KafkaFilterPushdownTest is sometimes failing to pattern match correctly.
> ------------------------------------------------------------------------
>
>                 Key: DRILL-7164
>                 URL: https://issues.apache.org/jira/browse/DRILL-7164
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Kafka
>    Affects Versions: 1.16.0
>            Reporter: Hanumath Rao Maduri
>            Assignee: Abhishek Ravi
>            Priority: Major
>             Fix For: 1.17.0
>
>
> On my private build I am intermittently hitting a failure in the Kafka 
> storage tests. Here is the failure I came across.
> {code}
>       at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_91]
> 15:01:39.852 [main] ERROR org.apache.drill.TestReporter - Test Failed (d: 
> -292 B(75.4 KiB), h: -391.1 MiB(240.7 MiB), nh: 824.5 KiB(129.0 MiB)): 
> testPushdownOffsetOneRecordReturnedWithBoundaryConditions(org.apache.drill.exec.store.kafka.KafkaFilterPushdownTest)
> java.lang.AssertionError: Unable to find expected string     "kafkaScanSpec" 
> : {
>       "topicName" : "drill-pushdown-topic"
>     },
>     "cost" in plan: {
>   "head" : {
>     "version" : 1,
>     "generator" : {
>       "type" : "ExplainHandler",
>       "info" : ""
>     },
>     "type" : "APACHE_DRILL_PHYSICAL",
>     "options" : [ {
>       "kind" : "STRING",
>       "accessibleScopes" : "ALL",
>       "name" : "store.kafka.record.reader",
>       "string_val" : 
> "org.apache.drill.exec.store.kafka.decoders.JsonMessageReader",
>       "scope" : "SESSION"
>     }, {
>       "kind" : "BOOLEAN",
>       "accessibleScopes" : "ALL",
>       "name" : "exec.errors.verbose",
>       "bool_val" : true,
>       "scope" : "SESSION"
>     }, {
>       "kind" : "LONG",
>       "accessibleScopes" : "ALL",
>       "name" : "store.kafka.poll.timeout",
>       "num_val" : 5000,
>       "scope" : "SESSION"
>     }, {
>       "kind" : "LONG",
>       "accessibleScopes" : "ALL",
>       "name" : "planner.width.max_per_node",
>       "num_val" : 2,
>       "scope" : "SESSION"
>     } ],
>     "queue" : 0,
>     "hasResourcePlan" : false,
>     "resultMode" : "EXEC"
>   },
>   "graph" : [ {
>     "pop" : "kafka-scan",
>     "@id" : 6,
>     "userName" : "",
>     "kafkaStoragePluginConfig" : {
>       "type" : "kafka",
>       "kafkaConsumerProps" : {
>         "bootstrap.servers" : "127.0.0.1:56524",
>         "group.id" : "drill-test-consumer"
>       },
>       "enabled" : true
>     },
>     "columns" : [ "`**`", "`kafkaMsgOffset`" ],
>     "kafkaScanSpec" : {
>       "topicName" : "drill-pushdown-topic"
>     },
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 5.0
>     }
>   }, {
>     "pop" : "project",
>     "@id" : 5,
>     "exprs" : [ {
>       "ref" : "`T23¦¦**`",
>       "expr" : "`**`"
>     }, {
>       "ref" : "`kafkaMsgOffset`",
>       "expr" : "`kafkaMsgOffset`"
>     } ],
>     "child" : 6,
>     "outputProj" : false,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 5.0
>     }
>   }, {
>     "pop" : "filter",
>     "@id" : 4,
>     "child" : 5,
>     "expr" : "equal(`kafkaMsgOffset`, 9) ",
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 0.75
>     }
>   }, {
>     "pop" : "selection-vector-remover",
>     "@id" : 3,
>     "child" : 4,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 1.0
>     }
>   }, {
>     "pop" : "project",
>     "@id" : 2,
>     "exprs" : [ {
>       "ref" : "`T23¦¦**`",
>       "expr" : "`T23¦¦**`"
>     } ],
>     "child" : 3,
>     "outputProj" : false,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 1.0
>     }
>   }, {
>     "pop" : "project",
>     "@id" : 1,
>     "exprs" : [ {
>       "ref" : "`**`",
>       "expr" : "`T23¦¦**`"
>     } ],
>     "child" : 2,
>     "outputProj" : true,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 1.0
>     }
>   }, {
>     "pop" : "screen",
>     "@id" : 0,
>     "child" : 1,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 1.0
>     }
>   } ]
> }!
> {code}
> An earlier check-in (d22e68b83d1d0cc0539d79ae0cb3aa70ae3242ad) changed the 
> way cost is represented. It also changed the test, in a way that I think is 
> not right. The pattern compared against the plan should be made more robust 
> to fix this issue generically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
