---------- Forwarded message ----------
From: Liquan Pei <liquan...@gmail.com>
Date: Thu, Oct 2, 2014 at 3:42 PM
Subject: Re: Spark SQL: ArrayIndexOutofBoundsException
To: SK <skrishna...@gmail.com>


There is only one place where you use index 1. One possible issue is that some
lines may have only one element after you split by "\t".

Can you try to run the following code to make sure every line has at least
two elements?

val badLines = sc.textFile(inp_file)
                 .map(_.split("\t"))
                 .filter(_.length < 2)   // keep only lines with fewer than two fields
                 .count()
It will return a non-zero count if your data contains any line with fewer than
two fields.
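
If the count is non-zero, one way to handle it is to drop the malformed lines
before building the schema RDD. Here is a minimal sketch, assuming your TUser
case class takes a String and an Int as in your snippet (the field names below
are placeholders) and that your existing `import sql_cxt._` is in scope:

     // placeholder field names; use whatever your TUser actually declares
     case class TUser(userid: String, value: Int)

     val tusers = sc.textFile(inp_file)
                    .map(_.split("\t"))
                    .filter(_.length >= 2)                   // skip lines missing a field
                    .map(p => TUser(p(0), p(1).trim.toInt))

     tusers.registerTempTable("tusers")

Note that p(1).trim.toInt will still fail if the second field is not numeric,
so you may want to check that as well.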

Liquan

On Thu, Oct 2, 2014 at 3:35 PM, SK <skrishna...@gmail.com> wrote:

> Hi,
>
> I am trying to extract the number of distinct users from a file using Spark
> SQL, but I am getting the following error:
>
>
> ERROR Executor: Exception in task 1.0 in stage 8.0 (TID 15)
> java.lang.ArrayIndexOutOfBoundsException: 1
>
>
>  I am following the code in examples/sql/RDDRelation.scala. My code is as
> follows. The error appears when the SQL statement executes. I am new to
> Spark SQL. I would like to know how I can fix this issue.
>
> thanks for your help.
>
>
>      val sql_cxt = new SQLContext(sc)
>      import sql_cxt._
>
>      // read the data using the schema and create a schema RDD
>      val tusers = sc.textFile(inp_file)
>                            .map(_.split("\t"))
>                            .map(p => TUser(p(0), p(1).trim.toInt))
>
>      // register the RDD as a table
>      tusers.registerTempTable("tusers")
>
>      // get the number of unique users
>      val unique_count = sql_cxt.sql("SELECT COUNT (DISTINCT userid) FROM
> tusers").collect().head.getLong(0)
>
>      println(unique_count)
>


-- 
Liquan Pei
Department of Physics
University of Massachusetts Amherst


