[jira] [Comment Edited] (KUDU-2210) Apache Spark gets stuck while reading a Kudu table.

2017-11-09 Thread Andrew Ya (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245380#comment-16245380
 ] 

Andrew Ya edited comment on KUDU-2210 at 11/9/17 8:50 AM:
--

We are using Spark 1.6.0 and kudu-spark_2.10-1.2.0.jar.
The logs look like this:
{code}
17/11/03 10:15:55 INFO executor.Executor: Running task 93.0 in stage 1.0 (TID 92)
17/11/03 10:17:27 INFO executor.Executor: Finished task 93.0 in stage 1.0 (TID 92). 1145 bytes result sent to driver
17/11/03 10:17:27 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 104
17/11/03 10:17:27 INFO executor.Executor: Running task 103.0 in stage 1.0 (TID 104)
17/11/03 10:19:00 INFO executor.Executor: Finished task 103.0 in stage 1.0 (TID 104). 1145 bytes result sent to driver
17/11/03 10:19:00 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 113
17/11/03 10:19:00 INFO executor.Executor: Running task 118.0 in stage 1.0 (TID 113)
17/11/03 10:21:43 INFO executor.Executor: Finished task 118.0 in stage 1.0 (TID 113). 1145 bytes result sent to driver
17/11/03 10:21:43 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 129
17/11/03 10:21:43 INFO executor.Executor: Running task 131.0 in stage 1.0 (TID 129)
17/11/03 10:25:03 INFO executor.Executor: Finished task 131.0 in stage 1.0 (TID 129). 1145 bytes result sent to driver
17/11/03 10:25:03 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 139
17/11/03 10:25:03 INFO executor.Executor: Running task 142.0 in stage 1.0 (TID 139)
17/11/07 09:54:59 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
17/11/07 09:54:59 INFO storage.MemoryStore: MemoryStore cleared
{code}

During the execution of the Spark job I got some exceptions:
{code}
org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID in scan request
    at org.apache.kudu.client.TabletClient.dispatchTSErrorOrReturnException(TabletClient.java:557)
    at org.apache.kudu.client.TabletClient.decode(TabletClient.java:488)
    at org.apache.kudu.client.TabletClient.decode(TabletClient.java:82)
...
{code}

but the job didn't fail; the failed tasks were re-executed and completed successfully.
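
For context, here is a rough Scala sketch (my own illustration, not code from the job) of what each Spark task effectively does through the Kudu Java client. As far as I can tell, kudu-spark opens one scanner per tablet, each {{nextRows()}} call is a separate scan RPC carrying an incrementing call sequence ID, and a retried or out-of-order request with a stale ID is what the tablet server rejects with the error above; Spark then simply re-runs the task. The host and table names just reuse the hypothetical ones from this issue.
{code}
// Rough sketch of a per-tablet scan with the Kudu Java client (not the actual kudu-spark code).
import org.apache.kudu.client.KuduClient

val client  = new KuduClient.KuduClientBuilder("host1:7051,host2:7051,host3:7051").build()
val table   = client.openTable("test_table")
val scanner = client.newScannerBuilder(table).build()

try {
  while (scanner.hasMoreRows) {
    // Each nextRows() call is one scan RPC; the server expects the next call sequence ID.
    val batch = scanner.nextRows()
    while (batch.hasNext) {
      val row = batch.next()
      // consume the row here
    }
  }
} finally {
  scanner.close()
  client.shutdown()
}
{code}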


> Apache Spark gets stuck while reading a Kudu table.
> -
>
> Key: KUDU-2210
> URL: https://issues.apache.org/jira/browse/KUDU-2210
> Project: Kudu
>  Issue Type: Bug
>  Components: client, perf, spark
> Reporter: Andrew Ya
>
> When I try to read a Kudu table with Apache Spark using the following code
> {code}
> import org.apache.kudu.spark.kudu._
> import sqlContext.implicits._
> val kuduOptions: Map[String, String] = Map(
> "kudu.table"  -> "test_table", 
> "kudu.master" -> "host1:7051,host2:7051,host3:7051")
> val kuduDF = sqlContext.read.options(kuduOptions).kudu
> kuduDF.registerTempTable("t")
> sqlContext.sql(" SELECT * FROM t  where id in (,) ").show(50, false)
> {code}
> after completing 95% of the tasks, the job gets stuck for more than three days. The
> table is partitioned by date and the partitions are unevenly sized: one partition
> is about 12 GB, about 20 partitions are between 1 GB and 3 GB each, and the rest
> hold only megabytes or kilobytes of data.
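
One mitigation worth sketching for the skew described above (my own assumption, not something from the report): the scan itself still produces one Spark task per Kudu tablet, so the 12 GB tablet keeps a single task busy, but repartitioning right after the read at least spreads the scanned rows across more tasks for the downstream stages. The option values and the partition count are only illustrative:
{code}
import org.apache.kudu.spark.kudu._

val kuduOptions: Map[String, String] = Map(
  "kudu.table"  -> "test_table",                        // table name reused from the example above
  "kudu.master" -> "host1:7051,host2:7051,host3:7051")

// The Kudu scan still runs one task per tablet; repartition only helps the
// stages that come after it.
val rawDF  = sqlContext.read.options(kuduOptions).kudu
val kuduDF = rawDF.repartition(200)                     // 200 is an arbitrary illustrative value
kuduDF.registerTempTable("t")
{code}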

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)