No, those were just examples of what the maps can look like. In my case, the key-value pair is either absent or present in the form of the latter:
{"key1":{"key2":"value"}}

If key1 is present, it will contain a key2:value pair, the value being an int, I guess. After some testing, it appears my problem is in how casting a map value to the primitive types float and double is handled. Casting to int works fine, but float and double cause the exception.

Thanks.

Dominic Ricard
Triton Digital

-----Original Message-----
From: Cheng Lian [mailto:lian.cs....@gmail.com]
Sent: Thursday, September 24, 2015 5:47 PM
To: Dominic Ricard; user@spark.apache.org
Subject: Re: Using Map and Basic Operators yield java.lang.ClassCastException (Parquet + Hive + Spark SQL 1.5.0 + Thrift)

On 9/24/15 11:34 AM, Dominic Ricard wrote:
> Hi,
> I stumbled on the following today. We have Parquet files that expose a column in a map format. This is very convenient, as we have data parts that can vary over time. Not knowing what the data will be, we simply split it into tuples and insert it as a map inside one column.
>
> Retrieving the data is very easy. The syntax looks like this:
>
> select column.key1.key2 from table;
>
> Column values look like this:
> {}
> {"key1":"value"}
> {"key1":{"key2":"value"}}

Do you mean that the value type of the map may also vary? The 2nd record has a string value, while the 3rd one has another nested map as its value. This isn't supported in Spark SQL.

> But when trying to use basic operators on that column, I get the following error:
>
> query: select (column.key1.key2 / 30 < 1) from table
>
> ERROR processing query/statement.
> Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS,
> infoMessages:[*org.apache.hive.service.cli.HiveSQLException:java.lang.ClassCastException: org.apache.spark.sql.types.NullType$ cannot be cast to org.apache.spark.sql.types.MapType:26:25,
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation:runInternal:SparkExecuteStatementOperation.scala:259,
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation:run:SparkExecuteStatementOperation.scala:144,
> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:388,
> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:369,
> sun.reflect.GeneratedMethodAccessor115:invoke::-1,
> sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
> java.lang.reflect.Method:invoke:Method.java:497,
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
> org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
> org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
> java.security.AccessController:doPrivileged:AccessController.java:-2,
> javax.security.auth.Subject:doAs:Subject.java:422,
> org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1628,
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
> com.sun.proxy.$Proxy39:executeStatement::-1,
> org.apache.hive.service.cli.CLIService:executeStatement:CLIService.java:261,
> org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:486,
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1313,
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1298,
> org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
> org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
> org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285,
> java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142,
> java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617,
> java.lang.Thread:run:Thread.java:745], errorCode:0,
> errorMessage:java.lang.ClassCastException: org.apache.spark.sql.types.NullType$ cannot be cast to org.apache.spark.sql.types.MapType), Query:
>
> Trying to apply a logical or arithmetic operator on a map value yields this error.
>
> The solution I found was to simply cast the value as an int and change my logical evaluation to an equality check. (In this case, I wanted to get TRUE if the value was less than 1, and casting it as a float or double yielded the same error.)
>
> select cast(cast(column.key1.key2 as int) / 30 as int) = 0 from table
> (Yes, I need to cast the result of the division for some reason...)
>
> Can anyone shine some light on why the map type has problems dealing with basic operators?
>
> Thanks!
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-Map-and-Basic-Operators-yield-java-lang-ClassCastException-Parquet-Hive-Spark-SQL-1-5-0-Thrift-tp24809.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
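To illustrate the restriction Cheng points out (a map column must have one uniform value type), here is a minimal sketch in plain Python, not Spark API code: the three sample records from the original post cannot agree on a single value type, so no single MapType can describe the column.

```python
# Sketch: why Spark SQL cannot infer one MapType for the three sample
# records above. Spark requires every value in a map column to share a
# single type; these records mix string and nested-map values.
# (Plain Python illustration, not Spark API code.)

def value_types(record):
    """Return the set of Python type names used as map values in one record."""
    return {type(v).__name__ for v in record.values()}

records = [
    {},                            # no keys: contributes no type evidence
    {"key1": "value"},             # value is a string
    {"key1": {"key2": "value"}},   # value is a nested map
]

observed = set()
for r in records:
    observed |= value_types(r)

# More than one observed value type means no single MapType fits all records.
consistent = len(observed) <= 1
print(observed)      # {'str', 'dict'}
print(consistent)    # False
```

In Spark SQL terms, the empty record and the string-valued record give the planner no map to resolve against, which is consistent with the `NullType$ cannot be cast to MapType` message in the trace, though the exact trigger would need confirmation against the 1.5.0 source.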