[jira] [Comment Edited] (KUDU-2210) Apache Spark stucks while reading Kudu table.

[ https://issues.apache.org/jira/browse/KUDU-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245380#comment-16245380 ]

Andrew Ya edited comment on KUDU-2210 at 11/9/17 8:50 AM:
----------------------------------------------------------

We are using Spark 1.6.0 and kudu-spark_2.10-1.2.0.jar. The logs look like:

{code}
17/11/03 10:15:55 INFO executor.Executor: Running task 93.0 in stage 1.0 (TID 92)
17/11/03 10:17:27 INFO executor.Executor: Finished task 93.0 in stage 1.0 (TID 92). 1145 bytes result sent to driver
17/11/03 10:17:27 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 104
17/11/03 10:17:27 INFO executor.Executor: Running task 103.0 in stage 1.0 (TID 104)
17/11/03 10:19:00 INFO executor.Executor: Finished task 103.0 in stage 1.0 (TID 104). 1145 bytes result sent to driver
17/11/03 10:19:00 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 113
17/11/03 10:19:00 INFO executor.Executor: Running task 118.0 in stage 1.0 (TID 113)
17/11/03 10:21:43 INFO executor.Executor: Finished task 118.0 in stage 1.0 (TID 113). 1145 bytes result sent to driver
17/11/03 10:21:43 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 129
17/11/03 10:21:43 INFO executor.Executor: Running task 131.0 in stage 1.0 (TID 129)
17/11/03 10:25:03 INFO executor.Executor: Finished task 131.0 in stage 1.0 (TID 129). 1145 bytes result sent to driver
17/11/03 10:25:03 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 139
17/11/03 10:25:03 INFO executor.Executor: Running task 142.0 in stage 1.0 (TID 139)
17/11/07 09:54:59 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
17/11/07 09:54:59 INFO storage.MemoryStore: MemoryStore cleared
{code}

During the execution of the Spark job I got some exceptions:

{code}
org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID in scan request
	at org.apache.kudu.client.TabletClient.dispatchTSErrorOrReturnException(TabletClient.java:557)
	at org.apache.kudu.client.TabletClient.decode(TabletClient.java:488)
	at org.apache.kudu.client.TabletClient.decode(TabletClient.java:82)
	...
{code}

but the job didn't fail; the failed tasks were re-executed and completed successfully.

was (Author: andrew_ya):
We are using kudu-spark_2.10-1.2.0.jar, followed by the same logs and exceptions as above.

> Apache Spark stucks while reading Kudu table.
> ---------------------------------------------
>
>          Key: KUDU-2210
>          URL: https://issues.apache.org/jira/browse/KUDU-2210
>      Project: Kudu
>   Issue Type: Bug
>   Components: client, perf, spark
>     Reporter: Andrew Ya
>
> When I try reading a Kudu table with Apache Spark using the following code
> {code}
> import org.apache.kudu.spark.kudu._
> import sqlContext.implicits._
>
> val kuduOptions: Map[String, String] = Map(
>   "kudu.table" -> "test_table",
>   "kudu.master" -> "host1:7051,host2:7051,host3:7051")
>
> val kuduDF = sqlContext.read.options(kuduOptions).kudu
> kuduDF.registerTempTable("t")
> sqlContext.sql(" SELECT * FROM t where id in (,) ").show(50, false)
> {code}
> the job gets stuck for more than three days after completing 95% of the tasks. The table is partitioned by date and the partitions are unevenly sized: one partition is 12 GB, about 20 are between 1 GB and 3 GB, and some hold only megabytes or kilobytes of data.
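The uneven partition sizes described above are enough to explain the long tail: with one Spark task per Kudu partition running in parallel, the stage finishes only when the scan of the largest partition does. A minimal Scala sketch of that arithmetic, using hypothetical sizes shaped like the report's skew and an assumed uniform scan throughput (neither figure is from the report):

```scala
// Partition sizes in GB mirroring the reported skew: one 12 GB partition,
// about twenty partitions of 1-3 GB, and a few tiny ones (hypothetical values).
val sizesGb: Seq[Double] = Seq(12.0) ++ Seq.fill(20)(2.0) ++ Seq(0.05, 0.001)

// Assumed uniform per-task scan throughput in GB/minute (not from the report).
val gbPerMinute = 0.5

// One task per partition, all running in parallel: the stage is gated
// by its slowest task, not by the average one.
def stageMinutes(sizes: Seq[Double]): Double = sizes.map(_ / gbPerMinute).max

val avgMinutes = sizesGb.map(_ / gbPerMinute).sum / sizesGb.size

println(f"stage: ${stageMinutes(sizesGb)}%.1f min, average task: $avgMinutes%.1f min")
```

Under these assumed numbers the 12 GB partition gates the stage at 24 minutes while the average task finishes in under 5, so the UI sits at ~95% complete; a genuinely stuck scanner on that one partition stalls the whole job the same way.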
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
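The observation that the scan exceptions did not fail the job matches Spark's task-level fault tolerance: a failed task is resubmitted (by default up to spark.task.maxFailures = 4 attempts), and the re-run opens a fresh Kudu scanner, so the stale call sequence ID does not recur. A generic sketch of that retry discipline in Scala, with a hypothetical stand-in for the scan; this is illustrative code, not Spark's actual scheduler:

```scala
// Re-run a task for up to maxFailures attempts, as Spark's scheduler does
// when a task dies with an exception like the NonRecoverableException above.
// If every attempt fails, the last exception propagates and the job fails.
def runWithRetries[T](maxFailures: Int)(task: Int => T): T = {
  def attempt(n: Int): T =
    try task(n)
    catch {
      case _: RuntimeException if n + 1 < maxFailures => attempt(n + 1)
    }
  attempt(0)
}

// Hypothetical stand-in for a Kudu scan task: the first attempt hits the
// "Invalid call sequence ID" error, the retry (fresh scanner) succeeds.
var attemptsMade = 0
val resultBytes = runWithRetries(maxFailures = 4) { n =>
  attemptsMade += 1
  if (n == 0) throw new RuntimeException("Invalid call sequence ID in scan request")
  1145 // result size echoing the "1145 bytes result sent to driver" log lines
}
```

That the reporter's job completed therefore implies every retried scan eventually succeeded within its four attempts.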